FREQUENCY CURVES
AND
CORRELATION


W. PALIN ELDERTON
C.B.E., F.I.A., F.F.A.


THIRD EDITION 


CAMBRIDGE 
AT THE UNIVERSITY PRESS 

1938 



First Edition (Charles and Edwin Layton) 1906
Second Edition (Charles and Edwin Layton) 1927
Third Edition (Cambridge University Press) 1938


PRINTED IN GREAT BRITAIN



PREFACE TO SECOND EDITION

This book arose out of an attempt, published in 1903, to use
Professor Pearson’s system of frequency curves for the graduation
of Mortality Tables. The subject was then unfamiliar to
actuaries, and the Institute of Actuaries encouraged me to
write a book on the subject, arranged for its publication and
relieved me of any expense in connection with it. My gratitude
is not only for the broad-mindedness with which a professional
body approached recent research, but also for the help and
encouragement given to a young, untried and inexperienced
member of the profession. Nor does it end there, for when the
original edition was nearly exhausted the Institute generously
handed over the copyright and left me with a free hand as
regards the future.

In dealing with frequency curves and curve fitting, new
matter has been brought in and the order of treatment of the
curves has been altered so that main types are less likely to be
confused with the transition and minor types. A chapter is
devoted to a comparison of the Pearson curves with the series
suggested for use by Edgeworth in this country and by many
continental writers. The chapters on correlation, contingency,
etc. have been largely rewritten and a new chapter on partial
correlation added.

The book as it now stands assumes that the reader is familiar
with the Primer of Statistics or some other very elementary
book. It demands no mathematical knowledge beyond that
required for the first examination of the Institute of Actuaries
or the Intermediate Examination for the B.Sc. of London
University. The subject is, however, statistical and arithmetical,
and examples must be worked out if the methods and
principles are to be mastered. The reader who goes through a
book on a practical subject and does not work out examples
is as certain to encounter imaginary and miss real difficulties
as he is to fail to obtain any satisfactory knowledge of the
subject.

Even if a reader does not possess the mathematical equipment
indicated, he can use frequency curves and correlation
reasonably without it, for the fact that a curve he has found
agrees with the statistics from which the moments were
obtained is a proof that, in the particular case, he has obtained
proper values for the constants, even though he has not followed
the mathematical reasoning leading to the equations. It must
not be inferred that belief without proof is advisable, but that
it is unwise for a practical man to put aside a practical subject
which he can test practically merely because he cannot follow
some of the proofs. There is another class of statistical students
whose wants may be mentioned. I refer to those who have
little need to study graduation and curve fitting in detail, but
require a knowledge of correlation, probable errors, etc. For
the sake of these readers an abridged reading is suggested in
the Appendix.

Frequency curves, correlation and sampling form a subject
in which there is still a great deal to be done, notwithstanding
the progress that has been made in recent years. Much of this
work has been highly mathematical, especially when it deals
with certain small samples or with attempts to find mathematical
expressions for skew correlation surfaces. These aspects
lie outside such a book as this, and, even if we neglect them,
we may still say that there are few subjects that offer a richer
field for original work. In this field the reader will find that
during the past thirty years we are indebted to Professor Karl
Pearson and his school for much of the work that has proved
a success in practice, and anyone writing on the subject for
practical men is bound to follow in his footsteps. Only those
who become interested in the subject and study Professor
Pearson’s original work will appreciate the great extent of his
contribution to statistical science.



I hope that the numerical examples in this book, and similar
arithmetical work done elsewhere, may tend to show that
actuarial statistics can be examined in the same way as the
statistics of biology, anthropology or sociology. May not such
work add some links to the chain of continuity and indicate a
wider law than an actuary studying his own subject exclusively
might be led to suspect?

As will be readily appreciated, I am chiefly indebted to
Professor Pearson, but the indebtedness is of a kind for which
it is impossible to offer formal thanks; such thanks would, at
their best, fail to express the sense of gratitude which prompted
them.

The revision of the book has been a reminder of much kind
help received in connection with the first edition from Mr G. J.
Lidstone and Mr John Spencer, both of whom read the work in
a somewhat different form in MS. and made many valuable
suggestions, and from Messrs S. Adlard and R. L. Elderton,
who then spent much time in reading proofs and suggested
difficulties that would probably arise and ways of removing
them. In connection with the new edition, Mr H. B. Smither
has helped with some of the calculations, and both he and
Mr H. T. Adlard have read the book in proof, help for which I
am, indeed, grateful. At many stages in the work my sister,
Miss Ethel M. Elderton, has come to my aid, and, bearing in
mind her experience in teaching the subject as well as her
practical work, it would have been better if the book had been
hers and not mine: any improvement in this edition is probably
hers already.

W. P. E.

19 COLEMAN STREET
LONDON, E.C. 2


July 1927 




PREFACE TO THIRD EDITION 

The book has been altered in many respects, and Chapters
XI and XII and some of the Appendices have been rewritten.
The notation for moments in the earlier editions has been
retained. Some writers find it helpful to use distinct symbols
for the “theoretical moment” and the “adjusted statistical
moment”. In practical curve fitting the two are equated.
The notation I use treats the latter as identical with the
former. Readers of other work, and especially of continental
work, must bear in mind these and other differences in
notation.

I am most grateful to Professor E. S. Pearson for the help
and advice he has given me so generously and sympathetically.
It is also a pleasure to thank Mr H. Latham Seal
for many suggestions and him and Mr H. J. Tappenden for
much help in connection with the proofs.

I hope these kind friends will not be thought to be in any 
way responsible for my shortcomings. 

W. P. E.


October 1937




CONTENTS


Chap. I     Introductory
      II    Frequency Distributions
      III   Method of Moments
      IV    Pearson’s System of Frequency-Curves
      V     Calculation
      VI    Comparison of Various Systems of Curves
      VII   Correlation
      VIII  Theoretical Distributions. Spurious Correlation
      IX    Correlation of Characters not Quantitatively Measurable
      X     Standard Errors
      XI    The Test of Goodness of Fit
      XII   The Correlation Ratio. Contingency
      XIII  Partial Correlation

Appendix I     Corrections for Moments
         II    B and Γ Functions
         III   The Equation to the Normal Surface
         IV    The Integration of some Expressions connected with
               the Normal Curve of Error
         V     Other Methods of Fitting Curves
         VI    Key to the Actuarial Terms and Symbols used
         VII   Abridged Reading
         VIII  References, etc.
         IX    Tables

Index

Folding Table of Curves at end




CHAPTER I
INTRODUCTORY 

1. The ordinary treatment of probability begins with the
assumption that the chance that a certain event will occur is
known, and proceeds to solve the problems that arise from
the combination of events or the repetition of a particular
experiment; it proves that a certain result is more likely to
occur from experiment than any other, that a result based on
a limited number of trials is unlikely to differ greatly from
the expected result, and that the proportional deviation from
the most probable result will generally decrease as the number
of trials is increased.

Experiments can easily be made to show that the theoretical
method leads to results which can be realised in practice when
the probabilities can be estimated accurately beforehand; for
example, various trials have been made with coin tossing in
which it has been found that if five coins are tossed together
and the number of them coming down “heads” is recorded,
then the distribution of the cases will agree with the binomial
expansion $(\tfrac12 + \tfrac12)^5$, as the ordinary theory leads us to expect.
Sequences of “heads” or “tails” form a series approximating
to the geometrical progression with a common ratio of $\tfrac12$, and
the drawing of cards from a pack gives a result closely agreeing
with the numbers that theoretical work suggests.
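
The binomial expansion referred to above can be written out term by term. The following is a minimal sketch in Python, assuming fair coins; the function name `binomial_terms` is introduced here only for illustration.

```python
from math import comb

def binomial_terms(n, p=0.5):
    """Terms of the expansion (q + p)^n: the chance of k heads, k = 0..n."""
    q = 1.0 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

# Five coins tossed together: chances of 0, 1, ..., 5 heads.
terms = binomial_terms(5)

# Expressed per 32 tosses of the five coins, the terms are whole numbers.
expected_per_32 = [round(t * 32) for t in terms]
print(expected_per_32)
```

With fair coins the six terms stand in the proportions 1, 5, 10, 10, 5, 1, and an actual experiment would be expected to approach these proportions as the number of tosses is increased.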

2. It frequently happens, however, that the probabilities
are not known, and it is impossible to tell whether we are
dealing with an experiment like coin tossing or sequences or
card-drawing; in fact, the only thing known is the distribution
of the number of cases into certain groups, and in these
circumstances the inverse problem of tracing the theoretical series
to which the statistics approximate may become an important
matter. The difficulty of the subject is increased because
statistics do not give the theoretical distribution exactly, and
it is impossible to tell where the differences between the actual
and theoretical results lie. To make the position clearer it will
be well to restate the problem and ask whether it is possible
to find the theoretical series to which a series, resulting from
a statistical experiment, approximates. It may be difficult,
perhaps impossible, to trace the probabilities corresponding
to a given case, but yet practicable to form a reasonable
opinion of the series of numbers that might be reached if
the experiment could be repeated an infinite number of times.
On turning to the reasons which make it advisable to find this
ideal result to which statistics approach, it will be seen that
the exact elementary probabilities are not of supreme importance,
and a reasonable representation of the series is of far
greater practical value. We notice that one of the first objects
of a statistician or an actuary dealing with statistical work
is to express the observations in a simple form so that practical
conclusions can be easily drawn from the figures that have
been collected. If the available statistics fall naturally into
fifty or sixty groups, he has to decide how they can be arranged
to bring out the important features of the problem on which
he is working, whereas if he can find a few numbers closely
connected with the original series which can be used as an index
to the whole, he can then give the result in a way that might
assist comparison with similar statistics, and enable others
who have to deal with the facts to appreciate the whole
distribution more readily than they could do if it remained in its
original form. The statistician has also to supply approximate
values for intermediate terms when only a few can be obtained
from his experience, or complete or continue a series when only
a part of it is known. In many cases he has to keep the same
terms as in his original series, but remove the roughnesses of
material due to limitations in the number of cases available
for his investigation; that is, he has to graduate his data.

3. In reality these objects are much alike, for if the statistical
tables can be represented by an algebraic or transcendental
formula, we can replace the whole series of numbers
by a few values (the constants in the formula) which, if we deal
systematically with the distributions we meet, facilitate
comparison or enable us to supply missing terms, while the
roughness of the original material can be removed by making a
suitable formula represent the original statistics as nearly as
possible. If a formula is based on theoretical considerations,
it may also give a solution of the problem in probabilities
mentioned at the outset, and we see that both the practical
and theoretical requirements can be dealt with at the same
time, for the smooth series sought by the theoretical student
is the same thing as the formula required for practical work.

4. The advantages of any system of curves depend on the
simplicity of the formulae and the number of classes of
observations that can be dealt with satisfactorily, for a
complicated expression is very little improvement on the original
groups of statistics, and a system which is not capable of
general application leaves the statistician in difficulties whenever
it breaks down. One other thing is necessary; if a formula
is known to be a suitable one, there must be some method of
finding the arithmetical constants that will give a good agreement
in the particular case. Such a method, if it is to be of
practical use, must be simple, reliable and capable of general
and systematic application.

A broad idea of the objects to be accomplished ought to be
kept clearly before the mind; they are likely to be forgotten
because of the large amount of detail necessarily connected
with the subject. It is also important because the advantages
of systematic treatment are often overlooked, and short cuts
and rough and ready methods are adopted to the detriment of
the work, and formulae having no scientific basis and having
no connection with others suitable to similar cases are sometimes
used in rather haphazard fashion by statisticians. The
consequence is that generalisation is impossible, and where
a law might be found one can see little but a great variety of
attempts by energetic workers to reach their own conclusions
regardless of the value of comparative statistics.



CHAPTER II 


FREQUENCY DISTRIBUTIONS

1. If statistics are arranged so as to show the number of
times, or frequency with which, an event happens in a particular
way, then the arrangement is a frequency distribution.
Although some of our results will be of wider applicability, we
shall generally confine our attention to these distributions.

2. It is necessary to have a name for the formula used to
describe such distributions, and the term “frequency-curve”
has been adopted for the purpose. The geometrical progression
which describes the number of sequences in any direct experiment,
such as coin tossing or dice throwing, is, in the limit,
a frequency-curve, the equation to which is y = [illegible].

3. Some distributions give the number of cases falling in
a certain group of values of the independent variable, while
others (e.g. Example V of Table I) give the number of cases
for an exact value. In the former case the exact values of
the independent variable to which the groups correspond must
be considered; for instance, “exposed to risk at age x” includes
those from $x - \tfrac12$ to $x + \tfrac12$, but the number of deaths at duration
n those from n to n + 1. When statistics are represented
graphically, effect should be given to these differences, and, to
bring out the points a little more clearly, the diagrams on
pp. 5 and 6 have been prepared. The drawings of distributions,
such as those in the diagrams, are called frequency polygons
or histograms.

4. When statistics give the number of cases for an exact
value of the independent variable, it is simple to plot them
in a diagram by drawing ordinates and joining their tops,
but in the case of groups of values there is a little complication,
for we can either draw a rectangle standing on the entire base
(Example II of diagram) or put in ordinates at the middle
points of the bases and then join their tops (Example III).
The former method gives the correct idea of the amount of
information conveyed by the statistics, but, for some purposes
(e.g. for seeing the possible shape of the curve), the latter is
more convenient, though it is open to technical objection.
Cases such as Examples I and IV are best expressed by the
kind of drawing given, while Example III, though open to
technical objection, gives a better indication to most people
of the shape of the actual distribution than a block
diagram.

5. The reader is no doubt already familiar with the fact that
statistics tend towards a smooth series as the total number of
cases is increased, and from this it can be seen how naturally
practical statistics lead to the conception of a frequency-curve
to describe the smooth distribution that would be obtained if
an infinite supply of homogeneous material were available for
investigation. In other words, such curves would give an
approximation to the total “population” of which the particular
case investigated was a sample.


                              Table I

Example I.   Withdrawals with monthly incidence “0” in year of exit,
             Principles and Methods (p. 92), by curtate duration.
Example II.  Exposed to risk of sickness (Watson, M.U. Tables, p. 19),
             by age group.
Example III. Existing at close of observations, Without Profit “Old”
             Assurances, by age group.
Example IV.  Existing at close of observations, “Old” Annuities
             (females), by age group.
Example V.   Terms of the expansion of 1000( [illegible] ), by number
             of term.

  Dur.        I      Ages       II     III      IV       V    Term
     1      308       -19       34                      32       1
     2      200     20-24      145                     127       2
     3      118     25-29      156                     232       3
     4       69     30-34      145                     258       4
     5       59     35-39      123                     194       5
     6       44     40-44      103       3             103       6
     7       29     45-49       86       9              40       7
     8       28     50-54       71      42              11       8
     9       26     55-59       55     111      29       2       9
    10       21     60-64       37     176      23       1      10
    11       18     65-69       21     200      81              11
    12       18     70-74       13     193     151
    13       12     75-79        7     160     192
    14       11     80-84        3      73     239
    15        5     85-89        1      26     157
    16       11     90-94                6      93
    17        7     95-99                1      29
    18        6      100-                        6
    19        1
    20        3
    21        1
    22        3
    23        2
 Total    1,000              1,000   1,000   1,000   1,000

                          I           II         III          IV          V
 True total           1,308    2,995,724       2,674         172
 Mean                 4.182      37.8750      68.485      79.400      3.998
 Standard deviation  4.1996      2.76810    1.771288    1.774894    1.46215
 Type                     I            I          II         VII

6. It may be noticed that a frequency-curve can be
interpreted to give a frequency corresponding to every value of the
independent variable along the whole range of the distribution,
and will not restrict us to a few more or less arbitrary groups
as is necessary with actual statistics. The binomial series and
geometrical progression do the same when we imagine we are
dealing with something that can be divided into a very large
number of groups. Thus, if we mix a large quantity of sand of
two colours and take out a fixed quantity of the mixture and
record the number of grains of sand of either colour in each
drawing, we should obtain a continuous curve from a large
number of trials.

7. We will now define some important functions. When
a distribution is arranged according to the progressive values
of a variable characteristic, e.g. duration, age, etc., the
average value of that characteristic (not the average of the
frequencies) is called the mean of the distribution, and is given
by

$$\frac{f_a x_a + f_b x_b + f_c x_c + \cdots + f_n x_n}{f_a + f_b + f_c + \cdots + f_n},$$

where $f_r$ = frequency corresponding to the value $x_r$ of
the variable; thus, in Example I, 200 is the frequency
corresponding to 2. If we assume infinitesimal increments, the
mean is given by

$$\frac{\int x\,f(x)\,dx}{\int f(x)\,dx},$$

where the limits of the integral will be such as to cover the
whole distribution. The mean could also be described as the
position of the ordinate through the centre of gravity of the
distribution (centroid vertical); this may be of help to some
readers.
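
The definition of the mean can be tried on Example IV of Table I. The sketch below is in Python and rests on one assumption that should be stated plainly: each group is treated as concentrated at a central age (57 for the group 55-59, and so on by steps of five years). The function name `mean` is introduced only for illustration.

```python
def mean(values, freqs):
    """Average value of the characteristic, weighted by the frequencies."""
    total = sum(freqs)
    return sum(x * f for x, f in zip(values, freqs)) / total

# Example IV of Table I; central ages are an assumption about the grouping.
central_ages = [57, 62, 67, 72, 77, 82, 87, 92, 97, 102]
freqs = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]

m = mean(central_ages, freqs)
print(round(m, 3))
```

Under that assumption the result agrees with the mean entered for Example IV at the foot of Table I.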

8. The mode is the characteristic that occurs most frequently;
in other words, it is the position of the maximum
ordinate. We cannot tell from the rough statistics which
ordinate is greatest, and the mode can therefore only be
determined approximately until the law connecting the
various groups, i.e. the frequency-curve, is known.

9. Now since an equation or curve might be used for several
distributions, one given according to age, a second of a different
subject according to duration, a third according to sums
assured, and so on, we must have a standard of reference based
on the distribution itself. For this purpose a function known
as the standard deviation is used. It is given by

$$\sqrt{\frac{a'^2 f_a + b'^2 f_b + \cdots + n'^2 f_n}{f_a + f_b + \cdots + f_n}},$$

where $a'$, $b'$, ..., $n'$ are the distances from the mean. In the
form of integrals the standard deviation is

$$\sqrt{\frac{\int x^2 f(x)\,dx}{\int f(x)\,dx}},$$

where x is measured from the mean.

The standard deviation measures the way the frequencies
are distributed in terms of the unit of measurement. As the
frequencies farthest from the mean are multiplied by the
largest values of $x^2$, a large standard deviation shows that the
frequency distribution spreads out from the mean, while a
small standard deviation shows that the frequency is closely
concentrated about the mean. In considering the relative
sizes of standard deviations, it is necessary to bear in mind
the unit of measurement, because, if a given distribution is
arranged in two series, first, according to years of age, and
then in quinquennial age groups, the standard deviation will
be five times as large in the latter case as it is in the former.
This can be seen at once by comparing the two expressions

and 

The latter is obviously five times the former. The values of
the standard deviations are given in Table I for each case.
The diagram on p. 11 shows two curves having the same
mean B and approximately the same area, but the dotted
curve has the larger standard deviation because it spreads
out more on each side of the mean.

The reader will notice from the algebraic expressions given
above that the mean, mode and standard deviation are not
dependent on the number of cases (i.e. on the absolute size
of the curve), but merely on the way they are distributed
(i.e. on the proportionate numbers or the shape of the curve).
The standard deviation measures the “spread” or “scatter”
of the statistics from the mean.
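
The two remarks just made, that the standard deviation does not depend on the number of cases and that it does depend on how far the material spreads from the mean, can be checked numerically. The sketch below is in Python; the small distribution is hypothetical and is not taken from Table I, and the function name `mean_and_sd` is introduced only for illustration.

```python
from math import sqrt

def mean_and_sd(values, freqs):
    """Mean and standard deviation of a frequency distribution."""
    total = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / total
    variance = sum((x - mean) ** 2 * f for x, f in zip(values, freqs)) / total
    return mean, sqrt(variance)

# Hypothetical distribution, not taken from Table I.
values = [1, 2, 3, 4, 5]
freqs = [1, 4, 6, 4, 1]

m, sd = mean_and_sd(values, freqs)

# Ten times as many cases in the same proportions: same mean, same spread.
m10, sd10 = mean_and_sd(values, [10 * f for f in freqs])

# The same cases placed twice as far apart: twice the standard deviation.
_, sd_wide = mean_and_sd([2 * x for x in values], freqs)
```

No adjustment for grouping is attempted here, so a calculation of this kind on grouped data need not reproduce the adjusted values printed in Table I exactly.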

10. An examination of frequency distributions (see Table I
and pp. 5 and 6) shows that most of them start at zero,
gradually rise to a maximum, and then fall sometimes at a very
different rate. If the rise and fall are at the same rate, distribution
will be symmetrical about the mean, which must then
coincide with the mode. The difference between the mean and
mode is therefore a function of the skewness or deviation from
symmetry. In order to get a satisfactory measure, the spread
of the material must be taken into account, and this leads
us to measure skewness by the distance between mean and
mode divided by standard deviation. If the mean is on the
left-hand side of the mode when the statistics are plotted out
in diagram, this function will be negative, and to remember
the sign it is convenient to write:

$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}.$$

The diagram on p. 11 will help to show the rationale of the
measure for skewness. It gives two curves having the same
mean B and the same mode A, but with different standard
deviations, and it is clear that the dotted curve, with its larger
standard deviation, is more nearly symmetrical than the other
curve.
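
The sign convention and the effect of the spread can be seen from a short calculation. The sketch below is in Python; the figures are hypothetical, chosen only to illustrate the measure, and the function name `skewness` is introduced for illustration.

```python
def skewness(mean, mode, standard_deviation):
    """Skewness as defined in the text: (mean - mode) / standard deviation."""
    return (mean - mode) / standard_deviation

# Hypothetical figures: a mean to the left of the mode gives a negative value.
left = skewness(mean=10.0, mode=12.0, standard_deviation=4.0)

# The same distance between mean and mode with a larger standard deviation
# gives a smaller skewness, i.e. a more nearly symmetrical distribution.
wider = skewness(mean=10.0, mode=12.0, standard_deviation=8.0)
```

Since the mode can only be found approximately from rough statistics, a value obtained in this way is itself only approximate until a frequency-curve has been fitted.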

11. We may summarise these functions by saying that the
mean and mode fix the position of the curve on the axis; the
standard deviation shows how the material is distributed about
the mean, and the skewness shows the amount of the deviation
from symmetry exhibited by the material.

These preliminary definitions will be sufficient for our
present purpose, but the functions defined will be more easily
understood when their actual connection with the practical
work of curve-fitting has been studied. A student working
at the subject for the first time should plot out several
distributions on cross-ruled paper, in order to familiarise himself with
their nature and appearance. He should calculate and insert
the means in the diagrams, but should not attempt to calculate
standard deviations until he knows something of the method
of moments.



12. Up to this point we have defined our statistics as
frequencies, that is, as a number of cases grouped together
as alike either because they are actually alike in the sense of
Example V or because the statistics throw them up in
comparatively narrow groupings as in Examples II, III and IV.
When, however, we are tabulating our experience we have to
deal with individual observations and they are grouped
subsequently. From this point of view if there are N observations
we may call them $o_1, o_2, o_3, \ldots, o_N$, where $o_1$ may stand for
the first observation and may be (see Example III) one of
the 200 existing in the 65-69 group. It might be a case
“existing” at age 66.12, and $o_2$ might be “existing” at age
73.72, $o_3$ at 42.26, $o_4$ at 67.37 and so on. Then the mean is

$$\frac{o_1 + o_2 + o_3 + \cdots + o_N}{N}.$$



CHAPTER III 


METHOD OF MOMENTS

1. Before we proceed to deal with suitable forms for use as
frequency-curves, it will be well to see if some method of
applying them to statistical examples can be found, for it is
clearly useless to suggest a curve and have no way of using it.
We require, therefore, a general method by which a given
formula can be fitted to a particular statistical experience,
and which may be applied to any expression (for instance,
Makeham’s formula for the force of mortality) on which we may
have decided as the basis of graduation. The first point to be
noticed in searching for a method is that if there are n constants
in the formula, we must form n equations between the formula and
the statistics. Thus, if we have three terms, say, y = 20, 40,
and 88, when x = 1, 2, and 3 respectively, and wish to use the
curve $y = a + bx + cx^2$ to describe them, we can, of course,
find values of a, b and c so that each item is exactly reproduced
by equating as follows:

$$\begin{aligned}
a + b + c &= 20\\
a + 2b + 2^2 c &= 40\\
a + 3b + 3^2 c &= 88
\end{aligned}$$

But if we have a fourth term y = 96 when x = 4, and use the
values of a, b, and c found from the three equations just given,
we should find that when x = 4, y = 164. This suggests that
when there are more terms in the statistics than there are
constants, the equations must be formed by using all the terms,
not by selecting from them. The graduating curve will not
necessarily reproduce exactly any of the observations, but
will run evenly through the roughnesses of the observed facts
so as to represent their general trend.
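
The arithmetic of this small example can be reproduced exactly. The sketch below is in Python and uses exact fractions; the helper `solve` is a plain Gauss-Jordan elimination written for this illustration and is not a method described in the text.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a non-zero entry in this column into position.
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        # Clear this column from every other row.
        for i in range(n):
            if i != col:
                factor = rows[i][col]
                rows[i] = [vi - factor * vc
                           for vi, vc in zip(rows[i], rows[col])]
    return [row[-1] for row in rows]

# Three terms reproduced exactly by y = a + b*x + c*x**2.
xs, ys = [1, 2, 3], [20, 40, 88]
a, b, c = solve([[1, x, x * x] for x in xs], ys)

# The same constants carried on to x = 4.
y4 = a + 4 * b + 16 * c
print(a, b, c, y4)
```

The constants found are a = 28, b = -22 and c = 14, and the curve through the first three terms gives 164 at x = 4 against the observed 96, which is the discrepancy the paragraph describes.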




2. Let $a_1, a_2, a_3, \ldots, a_n$ be the terms to be graduated; then, if
the series were perfectly smooth and followed a known law,
each term could be reproduced exactly by, say, $b_1, b_2, b_3, \ldots, b_n$,
where $a_1 = b_1$, $a_2 = b_2$, $a_3 = b_3$, ... and $a_n = b_n$. Now, if we
consider the two series (the a’s and the b’s), we see that since
each term is reproduced exactly

$$\sum_{r=1}^{n} a_r = \sum_{r=1}^{n} b_r \quad\text{and}\quad \sum_{r=1}^{n} c_r a_r = \sum_{r=1}^{n} c_r b_r,$$

where $c_r$ is a numerical coefficient.

This suggests a possible method to apply when each term
cannot be reproduced exactly. The total of the graduated
figures must be made equal to the total of the ungraduated,
and the further equations necessary for finding the unknown
constants must be formed by multiplying the various terms
by different factors and similarly equating the sums of the
graduated and ungraduated products, i.e. $\sum c_r a_r = \sum c_r b_r$. It
still remains to decide the best form to be given to $c_r$, and the
mean being equal to

$$\frac{a_1 + 2a_2 + 3a_3 + \cdots}{a_1 + a_2 + a_3 + \cdots}$$

suggests that $c_r = r$ should give one reasonable equation.
Again, since we shall have to use some function of r which,
when applied to the graduation formula, will give an integrable
form (otherwise we cannot make an equation between $\sum c_r b_r$
and $\sum c_r a_r$), the powers of r suggest themselves as convenient
when integration by parts is attempted. If, therefore, we write
$c_r = r^t$ and give t successively the values 0, 1, 2, ..., we can
obtain as many equations as we require, and from the first
two of them we find, successively, the area and the mean,
which will be the same in the graduated and ungraduated
figures.

This method is known as the Method of Moments (cf.
moments of inertia), and experience has shown that it is a
satisfactory method of fitting a curve to an actual statistical
experience. Confirmation on the theoretical side has been
produced, and while it is possible to invent other methods of
fitting particular curves, as has been done by actuaries in
connection with Makeham’s law of mortality, no better general
method has been produced (see Appendix V for note on
other methods).

3. Applying the method to solve the three equations given
above, we have

$$\begin{aligned}
(a+b+c) + (a+2b+2^2c) + (a+3b+3^2c) &= 20 + 40 + 88\\
(a+b+c) + 2(a+2b+2^2c) + 3(a+3b+3^2c) &= 20 + 2 \times 40 + 3 \times 88\\
(a+b+c) + 2^2(a+2b+2^2c) + 3^2(a+3b+3^2c) &= 20 + 2^2 \times 40 + 3^2 \times 88
\end{aligned}$$

or

$$\begin{aligned}
3a + 6b + 14c &= 148\\
6a + 14b + 36c &= 364\\
14a + 36b + 98c &= 972
\end{aligned}$$

These equations will give the same result as those from which
they were formed, because each of the three terms can be
graduated exactly, but if we introduce the fourth term,
x = 4, y = 96, we can modify the moment method by adding
a fourth term to each equation given above and obtain

$$\begin{aligned}
4a + 10b + 30c &= 244\\
10a + 30b + 100c &= 748\\
30a + 100b + 354c &= 2508
\end{aligned}$$

The solution of these equations gives

    a = -23.0
    b =  42.6
    c =  -3.0

or

    x = 1    y = 16.6
    x = 2    y = 50.2
    x = 3    y = 77.8
    x = 4    y = 99.4

This is a very simple example, but it will probably help to show
the way results are reached, and will serve as a foundation
for what follows.
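
The formation of the moment equations for the four terms can be checked mechanically. The sketch below is in Python with exact fractions; it builds the three equations by weighting each term by the powers 0, 1 and 2 of x, and then verifies that a = -23, b = 42.6, c = -3 satisfies them. The function name `moment_equation` is introduced only for illustration.

```python
from fractions import Fraction

xs, ys = [1, 2, 3, 4], [20, 40, 88, 96]

def moment_equation(t):
    """Coefficients of a, b, c and the right-hand side when every term of
    y = a + b*x + c*x**2 is multiplied by x**t and the results are summed."""
    coeffs = [sum(x ** (t + k) for x in xs) for k in range(3)]
    rhs = sum(x ** t * y for x, y in zip(xs, ys))
    return coeffs, rhs

equations = [moment_equation(t) for t in range(3)]

# Constants to be verified (b = 42.6 written as an exact fraction).
a, b, c = Fraction(-23), Fraction(213, 5), Fraction(-3)
residuals = [ca * a + cb * b + cc * c - rhs
             for (ca, cb, cc), rhs in equations]

# Graduated values at x = 1, 2, 3, 4.
graduated = [float(a + b * x + c * x * x) for x in xs]
print(equations, residuals, graduated)
```

All three residuals vanish, and the graduated values 16.6, 50.2, 77.8 and 99.4 have the same total, 244, as the four observed terms, as the first moment equation requires.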





4. The nth moment of a particular frequency is defined as
the product of the frequency and the nth power of the distance
of the frequency from the vertical about which moments are
being taken, or the nth moment of any ordinate y of a
frequency-curve about the vertical through a point distance x from it, is
$yx^n$; and the nth moment of the whole distribution treated as
a series of ordinates is $y_1 x_1^n + y_2 x_2^n + \cdots$, where $y_1 + y_2 + \cdots$ is
the total frequency. Thus, in Example V, the third moment
of the frequency 40 for term 7 about the vertical through
3 is $40 \times (+4)^3$.
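
The definition translates directly into a short function. The sketch below is in Python; the name `nth_moment` and its arguments are introduced only for illustration.

```python
def nth_moment(freqs, positions, n, about=0):
    """Sum of frequency times the nth power of the signed distance
    from the vertical through `about`."""
    return sum(f * (x - about) ** n for f, x in zip(freqs, positions))

# The single frequency 40 at term 7, third moment about the vertical through 3.
single = nth_moment([40], [7], 3, about=3)
print(single)
```

Because the distance is taken with its sign, odd moments of frequencies lying to the left of the chosen vertical are negative.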

5. If the ordinates are known, we can calculate the moment
for them immediately by multiplying the frequencies by the
powers of the distances between them and the vertical about
which the moments are required and then adding the results,
care being taken to give the distances their proper signs. If
areas are given, an approximation is made by assuming them
to be concentrated about the ordinates at the middle points
of the bases on which they stand. The columns after the third
in Table II show the calculation of the moments about the
vertical through age 77 for Example IV of Table I, on the
assumption that the frequencies are concentrated at the middle
points of the bases.

The unit of grouping has been taken as 5 years, and if, as is
often convenient, we assume the total frequency to be unity,
the totals will have to be divided by 1000. We should generally
deal with the actual numbers that occur, but as they have
been given in Table I as the distribution of 1000 cases, it will
be better to use them in that way in the present case. The
numbers -4, -3, ... in col. (3) show the distances from age 77
in terms of the unit of grouping. The centre of any other group
would have done almost as well as 77; it is convenient to choose
the arbitrary origin so that it is near the mean of the
distribution. This makes easier the calculation of the moments about
the mean (a result frequently required), and enables the
calculator to get a rough check on these moments by comparing
them with those about the arbitrary origin. The cols. (4)-(7)
are sufficiently explained by their headings; they are formed
successively and checked by multiplying f by $s^4$, the values
of $s^4$ being taken from a table of the powers of the natural
numbers.


Table II

Central age  Frequency  (x - 77)/5
 of group        f         = s       f × s    f × s²     f × s³     f × s⁴
   (1)          (2)        (3)        (4)       (5)        (6)        (7)

    57           29        -4        -116       464     -1,856      7,424
    62           23        -3         -69       207       -621      1,863
    67           81        -2        -162       324       -648      1,296
    72          151        -1        -151       151       -151        151
    77          192         0           0         0          0          0
    82          239        +1         239       239        239        239
    87          157        +2         314       628      1,256      2,512
    92           93        +3         279       837      2,511      7,533
    97           29        +4         116       464      1,856      7,424
   102            6        +5          30       150        750      3,750

Negative items                       -498                -3,276
Positive items                       +978                +6,612

Totals        1,000                  +480     3,464     +3,336     32,192
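
The column totals of Table II can be reproduced mechanically. The following is a minimal sketch in Python, not part of the original text; the names `raw_moment_totals`, `ORIGIN` and `UNIT` are chosen here purely for illustration, and the data are the central ages and frequencies of Table II.

```python
# Central ages and frequencies of Example IV as set out in Table II.
ages = [57, 62, 67, 72, 77, 82, 87, 92, 97, 102]
freqs = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]

ORIGIN = 77  # arbitrary origin: the centre of a group near the mean
UNIT = 5     # unit of grouping, in years


def raw_moment_totals(ages, freqs, origin, unit, max_order=4):
    """Return [sum f, sum f*s, ..., sum f*s**max_order], s = (age - origin)/unit."""
    totals = [0] * (max_order + 1)
    for age, f in zip(ages, freqs):
        s = (age - origin) // unit  # exact here: ages are spaced by the unit
        for k in range(max_order + 1):
            totals[k] += f * s ** k
    return totals


totals = raw_moment_totals(ages, freqs, ORIGIN, UNIT)
# Moments about age 77 for a total frequency of unity.
moments_about_origin = [t / totals[0] for t in totals]
print(totals)
print(moments_about_origin)
```

The list `totals` holds the totals of cols. (2) and (4)-(7), and dividing by the total frequency gives the moments about the arbitrary origin in units of grouping.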



Notation for Moments

$N$      = total frequency.
$\nu_n$  = nth unadjusted statistical moment about mean.
$\nu'_n$ = nth unadjusted statistical moment about any other point.
$\mu_n$  = nth moment from curve about mean
         = nth adjusted statistical moment about mean.
$\mu'_n$ = nth moment from curve about other point
         = nth adjusted statistical moment about other point.

Note. $\nu_n$, $\nu'_n$, $\mu_n$ and $\mu'_n$ always refer to a total frequency of unity.


The arithmetical work may be checked in other ways; for
instance, instead of checking the final column by multiplying
each term by the appropriate value of $x^4$, we can form a new
column $(x+1)^4 f$, which is the same thing as

$$x^4 f + 4x^3 f + 6x^2 f + 4x f + f.$$

The total of this new column can therefore be used to give a
check on the multiplication and addition. In the numerical
example (Table II) we should have $29 \times (-3)^4$,
$23 \times (-2)^4$, etc., $6 \times 6^4$; the total of such a
column is 69,240, which agrees with the totals of cols. (2)-(7)
in the following way:

                  32,192
    4 × 3,336 =   13,344
    6 × 3,464 =   20,784
    4 ×   480 =    1,920
                   1,000
                  ------
                  69,240
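
The check just described lends itself to a short program. The sketch below is an illustration only (the variable names are not from the text): it forms the column $(x+1)^4 f$ directly and compares its total with the binomial combination of the column totals.

```python
# Frequencies of Table II and their distances from age 77 in units of grouping.
freqs = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]
distances = list(range(-4, 6))

# Total of the check column (x + 1)^4 * f.
check_total = sum(f * (x + 1) ** 4 for f, x in zip(freqs, distances))

# Totals of f, x f, x^2 f, x^3 f and x^4 f.
column_totals = [sum(f * x ** k for f, x in zip(freqs, distances)) for k in range(5)]

# (x + 1)^4 = x^4 + 4x^3 + 6x^2 + 4x + 1, so the same binomial weights
# applied to the column totals must reproduce the check total.
binomial_weights = [1, 4, 6, 4, 1]
combined = sum(w * t for w, t in zip(binomial_weights, column_totals))

print(check_total, combined)
```

An error in any single product would make the two printed figures disagree, which is the purpose of the check.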


Helpful tables (powers and fourth moments) will be found
in Tables for Statisticians and Biometricians edited by K.
Pearson (Cambridge University Press). I shall in future refer
to this book as Tables for Statisticians. A student can manage
without these volumes, but at some expense of trouble.

6. It has so far been assumed that moments can be calculated
about any point, but it is frequently inconvenient to do
so, for if we had required them about age 79.4, we should
have had to multiply by the powers of (57 - 79.4)/5, of
(62 - 79.4)/5 and so on, and it is quite clear that the labour
would have been very great. In such a case we can, however,
take the moments about any other more convenient point,
and then modify them in the following way.

Let the distance between A, about which the moments are
known, and B, about which they are required, be $+d$; thus,
if we want moments about 25.7 and have found them about 25,
$d$ is .7; if we had found them about 26, $d$ would have been -.3.
Then, if the distance of any ordinate from A is $X_A$ and from
B is $X_B$, then

$$X_B = X_A - d \quad\text{and}\quad X_B^n = (X_A - d)^n.$$

Now, the nth moment of the whole distribution treated as a
series of ordinates is $\Sigma\, y X_A^n$ about A, and
$\Sigma\, y X_B^n$ about B, so we have

$$\nu_n = \Sigma[y X_B^n]
       = \Sigma[y\{X_A^n - n d X_A^{n-1} + \dots + (-1)^n d^n\}]
       = \nu'_n - n d\,\nu'_{n-1} + \frac{n(n-1)}{2!}\,d^2\,\nu'_{n-2} - \dots \qquad (1)$$

where $\nu_n$ is written for the nth moment about B, and $\nu'_n$
the nth moment about A.

Instead of (1) we may proceed as follows:

$$\nu'_n = \nu_n + n d\,\nu_{n-1} + \frac{n(n-1)}{2!}\,d^2\,\nu_{n-2} + \dots \qquad (2)$$

There is little to choose between these two formulae, and of
course they give identical results.

7. We will now apply them to work out the moments about
the centroid vertical (i.e. vertical through the mean) for the
example in Table II. The distance of the mean from any point is

$$\frac{\Sigma(X_A y)}{\Sigma(y)} = \frac{\Sigma(X_A y)}{N}$$

where $N$ is the total frequency; or we may say that the distance
of the mean from any point is the first moment of the
distribution about the vertical through that point. It follows
that the first moment about the centroid vertical is zero, so
that if such moments are required the term involving $\nu_1$ in
(2) is zero. When we deal with frequency-curves we shall see that
we generally require moments about the centroid vertical and in
designating them we shall leave out the dashes and use $\nu$.

8. The arithmetical work is as follows:

The totals in cols. (4)-(7) are divided by the number of
observations (total of col. (2)), and the quotients are the
moments ($\nu'$) about 77. The moments are dealt with as having
reference to a case where unity is the total frequency, i.e.
proportional, not actual, frequencies are dealt with.

$$\nu'_1 = .480 \qquad \nu'_2 = 3.464 \qquad \nu'_3 = 3.336 \qquad \nu'_4 = 32.192$$

The value of $\nu'_1$ gives the mean age = 77 + 5 × .480 = 79.4. In
order to use formula (1) or (2), the value of $d$ is required, and
when the calculation of moments has to be made about the
centroid vertical its value is, as we have seen above, the same
as $\nu'_1$; in the present case it is the first moment about the
vertical through age 77. The powers of $d$ are next calculated by
logarithms. As it happens $d$ is a comparatively simple number;
if it had been .48327, say, the propriety of using logarithms
would have been more obvious.

$$d^2 = .2304 \qquad d^3 = .110592 \qquad d^4 = .0530842$$

Remembering that $\nu_1$ is zero and $\nu_0$ and $\nu'_0$ are each
unity because they are the total frequency, we reach

$$\nu_2 = 3.2336 \qquad \nu_3 = -1.430976 \qquad \nu_4 = 30.416289$$
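
The transfer of moments from the arbitrary origin to the mean can be sketched as follows. This is an illustrative Python version of formula (1), assuming moments for a total frequency of unity; the function name `transfer_moments` is hypothetical and not taken from the text.

```python
from math import comb


def transfer_moments(about_a, d):
    """Apply formula (1): moments about B from moments about A.

    about_a[k] is the k-th moment about A for unit total frequency
    (so about_a[0] == 1), and d is the distance from A to B.
    """
    return [
        sum(comb(n, j) * (-d) ** j * about_a[n - j] for j in range(n + 1))
        for n in range(len(about_a))
    ]


# Moments about age 77 from Table II, in units of grouping.
about_77 = [1.0, 0.480, 3.464, 3.336, 32.192]

# The mean lies at a distance equal to the first moment from the origin.
about_mean = transfer_moments(about_77, about_77[1])
print(about_mean)
```

Worked in floating point, the second and third moments agree with the figures above, and the fourth comes out at about 30.41626, which agrees with the printed values to roughly four decimal places.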

It is wise to work to a large number of decimal places
because, owing to the subtractions involved, calculations
which began with, say, seven figures may end with only five.
It is well, therefore, to use a seven-place logarithm table
(e.g. Chambers's) and antilogarithm table (e.g. Filipowski's)
or a multiplying machine.

It will be noticed that the terms required in the calculation
of successive moments can be formed continuously. Thus, in
formula (1), with $d = \nu'_1$, we require to calculate the
following multiples:

    $d^2$ for the second moment,
    $3d\,\nu'_2$ and $2d^3$ for the third moment,
    $4d\,\nu'_3$, $6d^2\nu'_2$ and $3d^4$ for the fourth moment.

I have found it convenient to adopt a regular system in
calculating moments, as in other statistical work, and create
the habit of putting results and calculations in fixed positions,
so that the arithmetic, which is sometimes complicated, can be
followed quickly and can be confirmed or rectified more easily.

9. Although the above is the usual way to calculate
moments, another method was suggested by the late Sir G. F.
Hardy and used by him in his graduation of the British Offices
Tables 1863-93. He pointed out that by summing the statistical
numbers and forming a new series in the same way as
is done by actuaries when the $N_x$ column is formed from the
$D_x$ column, and then summing these results (cf. the $S_x$
column), and so on, equations can be formed which give the same
results as the method of moments. The arrangement in Table III
shows both the method of calculation and the form of the
expression obtained by the process.

Considering the line opposite the first term, we notice that
the sum of the series is given, and that the second summation,
which we will call $S_2$ when the total frequency is taken as
unity, gives the first moment of the whole distribution about a
vertical situated at unit distance before the point
corresponding to $f(1)$. Still considering only the first line,
we see that $S_3$ gives each function multiplied by coefficients
of the form

$$\frac{n(n+1)}{2} \quad\text{or}\quad \frac{n^2 + n}{2},$$

i.e. it gives $\frac{\nu'_2 + \nu'_1}{2}$, where $\nu'$ is written
for the moment, because by definition the tth moment ($\nu'_t$) of
the whole distribution is given by the sum of $n^t f(n)$ for all
values of $n$. $S_4$ and $S_5$ give each function multiplied by

$$\frac{n^3 + 3n^2 + 2n}{3!} \quad\text{and}\quad \frac{n^4 + 6n^3 + 11n^2 + 6n}{4!}$$

respectively.

The following equations result:

$$S_2 = \nu'_1$$
$$2S_3 = \nu'_2 + \nu'_1$$
$$6S_4 = \nu'_3 + 3\nu'_2 + 2\nu'_1$$
$$24S_5 = \nu'_4 + 6\nu'_3 + 11\nu'_2 + 6\nu'_1$$

These equations enable us to calculate the moments about
the selected origin, but if it is necessary to find moments about
the mean, the following relations are more convenient; they
can be reached by substituting in the above the values in
formula (2), and remembering that $S_2 = \nu'_1 = d$.

$$\nu_2 = 2S_3 - d(1 + d)$$
$$\nu_3 = 6S_4 - 3\nu_2(1 + d) - d(1 + d)(2 + d)$$
$$\nu_4 = 24S_5 - 2\nu_3\{2(1 + d) + 1\} - \nu_2\{6(1 + d)(2 + d) - 1\} - d(1 + d)(2 + d)(3 + d)$$
10. Table IV shows the working in the numerical example
already dealt with by the direct method. The fifth sum is
unnecessary, as the total of the items in the fourth sum gives
the only value required.


Table IV

Frequencies   First sum   Second sum   Third sum   Fourth sum

      29        1,000        5,480       19,372      54,508
      23          971        4,480       13,892      35,136
      81          948        3,509        9,412      21,244
     151          867        2,561        5,903      11,832
     192          716        1,694        3,342       5,929
     239          524          978        1,648       2,587
     157          285          454          670         939
      93          128          169          216         269
      29           35           41           47          53
       6            6            6            6           6

   1,000        5,480       19,372       54,508     132,503
(for check)


From the totals of the columns we have

$$S_2 = d = 5.48 \qquad S_3 = 19.372 \qquad S_4 = 54.508 \quad\text{and}\quad S_5 = 132.503$$

The first value $S_2$ or $d$ shows that the mean is at age
52 + 5.48 × 5 = 79.4. The age 52 is used because it is the centre
of the group before that in which numbers occur, and, as has
been already remarked, the summation method assumes the
work to be done with reference to this position. The application
of the formulae for $\nu_2$, $\nu_3$, $\nu_4$, given above, enables
us to find

$$\nu_2 = 3.2336 \qquad \nu_3 = -1.43099 \qquad \nu_4 = 30.4164$$
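
The summation method is easily mechanised. The sketch below is an illustration in Python rather than anything prescribed by the text: `repeated_sums` is a hypothetical helper that forms each column of Table IV by accumulating the previous one from the bottom upwards, and the relations of § 9 are then applied to the column totals.

```python
from itertools import accumulate

freqs = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]


def repeated_sums(values, times):
    """Return the successive summation columns, each summed from the bottom up."""
    columns = []
    column = list(values)
    for _ in range(times):
        column = list(accumulate(reversed(column)))[::-1]
        columns.append(column)
    return columns


first, second, third, fourth = repeated_sums(freqs, 4)
total = sum(freqs)

# The total of each column is the leading entry of the next summation.
S2 = sum(first) / total
S3 = sum(second) / total
S4 = sum(third) / total
S5 = sum(fourth) / total

d = S2
nu2 = 2 * S3 - d * (1 + d)
nu3 = 6 * S4 - 3 * nu2 * (1 + d) - d * (1 + d) * (2 + d)
nu4 = (24 * S5 - 2 * nu3 * (2 * (1 + d) + 1)
       - nu2 * (6 * (1 + d) * (2 + d) - 1)
       - d * (1 + d) * (2 + d) * (3 + d))

mean_age = 52 + 5 * S2  # origin is one unit before the first group
print(S2, S3, S4, S5)
print(mean_age, nu2, nu3, nu4)
```

Only additions are needed to build the columns, which is why the method suits an adding machine; the multiplications are confined to the final step.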

11. We may save arithmetical work in several ways when
using the summation method. If, instead of making all the
calculations implied in Table III we stop at the sums next
above the lines ruled in the various columns we shall have as
the final totals

$$\Sigma f(n), \quad \Sigma\, n f(n), \quad \Sigma\, \frac{n(n-1)}{2!} f(n) \quad\text{and}\quad \Sigma\, \frac{n(n-1)(n-2)}{3!} f(n).$$

The formulae of § 9 for $S_3$, $S_4$ and $S_5$ in terms of $\nu'_1$,
$\nu'_2$, etc. require modification only by altering the alternate
signs from + to -. The form of moment given in this paragraph
(§ 11) has been called "factorial moments".* A little further
saving of work can be effected by taking the figures up to the
totals next below the lines ruled in Table III. This gives the
same result as that just obtained if the origin is assumed to be
shifted one space. It will suffice if we take this second case
for a numerical example and using the figures from Table IV,
we should work only so far as 4480 for the second sum,
9412 for the third, 11,832 for the fourth and for the fifth
we sum the last six entries in the fourth column and obtain
9783.

12. These are the direct ways of using the summation
method, but, as in the multiplication method of calculating
moments, we can shorten the work by using a central term
instead of the first term as the starting point or arbitrary
origin.† We shall now use this arrangement with the "factorial
moment" form. A little care is necessary, because, though
there is no difficulty about the interpretation of the sums on
the positive side of the selected point, the moments for the
terms on the negative side assume that multiplications are
made by the powers of negative quantities. Table IV (A)
gives an example of summations that have to be made. The
figure 978 is $\Sigma\, n f(n)$ for values on the positive side of the
arbitrary origin and 498 is the similar sum on the negative
side, ignoring sign, say $\Sigma\, n f(-n)$. The mean is found by
dividing the difference, $\Sigma\, n f(n) - \Sigma\, n f(-n)$, by the total
frequency, i.e. (978 - 498)/1000 = .48. Taking, now, the final
figures in the columns headed "Third sum", 670 represents
$\Sigma\, \frac{n(n-1)}{2} f(n)$ and 324 the corresponding figure on the
negative side. Similarly with the other columns.

* The semi-invariants (or half-invariants) used by Thiele and other writers
can be obtained from moments. The second and third semi-invariants are the
same as the second and third moments about the mean and the fourth
semi-invariant is the fourth moment less three times the square of the second
moment.

† I have to thank Mr G. J. Lidstone for the suggestion of shortened
summations.

Now reverting to the expressions in § 11, which relate only
to positive summations, we can write

$$\nu'_2 = 2S'_3 + S'_2$$
$$\nu'_3 = 6S'_4 + 6S'_3 + S'_2$$
$$\nu'_4 = 24S'_5 + 36S'_4 + 14S'_3 + S'_2$$

where $S'$ is used instead of $S$ to indicate the different system
of summation.

In Table IV (A) however we have divided the distribution
into two parts, and in applying the expressions just given we
must work out each part separately, adding the items for even
moments and subtracting for odd moments. Hence for the
whole distribution

$$\nu'_2 = (2 \times .670 + .978) + (2 \times .324 + .498) = 3.464$$
$$\nu'_3 = (6 \times .269 + 6 \times .670 + .978) - (6 \times .139 + 6 \times .324 + .498) = 3.336$$
$$\nu'_4 = (24 \times .059 + 36 \times .269 + 14 \times .670 + .978) + (24 \times .029 + 36 \times .139 + 14 \times .324 + .498) = 32.192$$

and, transferring to the mean,

$$\nu_2 = 3.2336, \quad \nu_3 = -1.43098 \quad\text{and}\quad \nu_4 = 30.41626$$

We may now express the work in symbols. Writing $P$ for
summations on the positive side and $N$ for those on the
negative side of the arbitrary origin, we have

$$\nu'_2 = (2P_3 + P_2) + (2N_3 + N_2) = 2(P_3 + N_3) + (P_2 + N_2)$$

and similarly

$$\nu'_4 = 24(P_5 + N_5) + 36(P_4 + N_4) + 14(P_3 + N_3) + (P_2 + N_2).$$
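
As a hedged illustration of the shortened summation, the factorial sums on each side of the arbitrary origin (age 77) can be formed directly from their definition and combined as described, adding for even moments and subtracting for odd ones. The dictionaries `pos` and `neg` and the helper `factorial_sums` are names assumed here for the sketch; the frequencies are those of Table II.

```python
from math import comb

# Frequencies at n = 1, 2, ... units on either side of the arbitrary origin.
pos = {1: 239, 2: 157, 3: 93, 4: 29, 5: 6}
neg = {1: 151, 2: 81, 3: 23, 4: 29}
TOTAL = 1000  # total frequency, including the 192 at the origin


def factorial_sums(side):
    """Return {r: sum of C(n, r-1) * f(n)} for r = 2..5 (the P_r or N_r)."""
    return {r: sum(comb(n, r - 1) * f for n, f in side.items()) for r in range(2, 6)}


P = factorial_sums(pos)
N = factorial_sums(neg)

nu1 = (P[2] - N[2]) / TOTAL
nu2 = (2 * (P[3] + N[3]) + (P[2] + N[2])) / TOTAL
nu3 = (6 * (P[4] - N[4]) + 6 * (P[3] - N[3]) + (P[2] - N[2])) / TOTAL
nu4 = (24 * (P[5] + N[5]) + 36 * (P[4] + N[4])
       + 14 * (P[3] + N[3]) + (P[2] + N[2])) / TOTAL

print(P, N)
print(nu1, nu2, nu3, nu4)
```

In practice the sums would be obtained by repeated addition as in Table IV (A); the binomial coefficients are used here only because they state the same quantities compactly.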





13. A comparison of Table IV (A) with Table IV will show
that a saving of numerical work is effected by using a central
point as the starting point for the summation, for the sums are
numerically smaller and the value of $S_2$ or $d$, which enters into
the formulae on p. 20, is much smaller. It will be readily
appreciated that whenever there is a large number of terms the
summation method, and especially the form of it given in
Table IV (A), is an improvement on the product method of
calculating moments. By means of an adding machine the
summations can be obtained mechanically with little trouble,
even for series containing as many as a hundred terms.

14. In § 12 of Chapter II the mean was described
alternatively in terms of the individual observations, $O_i$.

Similarly the tth moment is

$$\frac{1}{N}\,\Sigma\, O_i^t$$

and the tth factorial moment is

$$\frac{1}{N}\,\Sigma\, O_i(O_i - 1)\dots(O_i - t + 1) = \frac{1}{N}\,\Sigma\, O_i^{(t)}$$

15. It is now necessary to consider the calculation of
moments from the curve, for until this has been done it is
impossible to form equations for finding the constants.

Let $y_x = f(x, a, b, c, \dots)$, where $a, b, c, \dots$ are constants to
be determined.

We have seen, on pp. 13 and 14, that one way of working
would be to find

$$f(1, a, b, c, \dots) \times 1^n + f(2, a, b, c, \dots) \times 2^n + \dots$$

say, $\sum_{x=1} f(x, a, b, c, \dots) \times x^n$,

and this would give a result which might be used in forming
equations if it were not for the fact that it is often impossible
to find an algebraic expression for the sum of such a series in
terms of the constants. It is, however, generally possible to
find such an expression for the integral, and as we have defined
the nth moment of an ordinate as $y_x x^n$, the nth moment of
the whole distribution from $x = h$ to $x = k$ is

$$\int_h^k y_x x^n\,dx \quad\text{or}\quad \int_h^k f(x, a, b, c, \dots)\,x^n\,dx.$$

The total frequency (i.e. total number of cases investigated)
is $\int_h^k y_x\,dx$, and the mean is
$\int_h^k y_x x\,dx \big/ \int_h^k y_x\,dx$, as we have already noticed.

16. If the moments from the equation to the curve are
calculated in this way and equated to the moments calculated
from statistics by assuming that the latter consist of a series
of ordinates, an inaccuracy is introduced.

Let us consider the two cases:

(1) When the statistics are a system of isolated terms or
ordinates* and we wish to pass a curve very closely
through them.

(2) When they are a system of areas but the moments are
calculated by assuming the areas to be concentrated at
the middle points of the bases.

* Strictly speaking, not a frequency distribution but a series of values
requiring graduation. Distributions have generally to be dealt with as areas
for frequency-curve work because they tell the way the whole number of cases
is divided in groups, and the whole area between the curve and the axis of x
must therefore be used.





17. In case (1) above, the terms $y_0, y_1, y_2, \dots, y_{n-1}$ are
given by the statistics, and since $\int_{-1/2}^{1/2} y_x\,dx$ is
approximately equal to $y_0$, it is simplest* to assume that
$\int_{-1/2}^{n-1/2} y_x\,dx$ is given by the equation to the curve,
and we have to find adjustments to counteract the error† caused
by equating $\sum_{x=0}^{n-1} x^t y_x$ to
$\int_{-1/2}^{n-1/2} x^t y_x\,dx$. The most practical way of
overcoming the difficulty is by calculating the true area
corresponding to the ordinates $y_0, y_1, \dots, y_{n-1}$ by means of
a quadrature formula (formula of approximate summation). Many
formulae are well known, but for the present purpose it is
convenient to have expressions which give approximate values of
an area in terms of ordinates lying both within and without the
base on which the area to be valued stands. Symbolically, these
formulae express $\int_{-1/2}^{1/2} y_x\,dx$ in terms of
$y_{-1/2}, y_{1/2}, y_{-3/2}, y_{3/2}$, etc., or
$y_0, y_1, y_{-1}, y_2, y_{-2}$, etc.


I. Let

$$y_x = a + bx + cx^2 + dx^3 + ex^4$$

then

$$\int_{-1/2}^{1/2} y_x\,dx = a + \frac{c}{12} + \frac{e}{80}$$

and

$$y_0 = a$$
$$y_{-1} + y_1 = 2(a + c + e)$$
$$y_{-2} + y_2 = 2(a + 4c + 16e)$$

Now, assume the required integral can be equated to

$$h y_0 + k(y_{-1} + y_1) + l(y_{-2} + y_2),$$

substitute the values given just above and equate the
coefficients of $a$, $c$ and $e$ respectively to 1, $\frac{1}{12}$ and
$\frac{1}{80}$, and we have

$$h + 2k + 2l = 1$$
$$2k + 8l = \tfrac{1}{12}$$
$$2k + 32l = \tfrac{1}{80}$$

The solution of these equations gives

$$h = \frac{5178}{5760}, \qquad k = \frac{308}{5760}, \qquad l = -\frac{17}{5760},$$

and we obtain

$$\int_{-1/2}^{1/2} y_x\,dx = \frac{1}{5760}\{5178\,y_0 + 308(y_{-1} + y_1) - 17(y_{-2} + y_2)\} \qquad \text{(I)}$$

* It is generally possible to use these limits in case (1), but if other limits
have to be taken, such as 0 to n, different quadrature formulae must be used.
† Actuarial readers will notice that the error is analogous to that introduced
by assuming (1
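
The three equations for $h$, $k$ and $l$ can be solved in exact rational arithmetic, and the resulting formula tested on any quartic. The following Python sketch is illustrative only; `exact_area` and `formula_one` are names introduced here, and the quartic used in the check is an arbitrary choice.

```python
from fractions import Fraction as F

# Equations obtained by equating the coefficients of a, c and e:
#   h + 2k +  2l = 1
#       2k +  8l = 1/12
#       2k + 32l = 1/80
l = (F(1, 80) - F(1, 12)) / 24  # third equation minus the second
k = (F(1, 12) - 8 * l) / 2      # back-substitute into the second
h = 1 - 2 * k - 2 * l           # and then into the first


def exact_area(a, c, e):
    """Integral of a + bx + cx^2 + dx^3 + ex^4 from -1/2 to +1/2 (odd terms vanish)."""
    return a + F(c, 12) + F(e, 80)


def formula_one(poly):
    """Evaluate h*y(0) + k*(y(-1) + y(1)) + l*(y(-2) + y(2)) for a callable poly."""
    return h * poly(0) + k * (poly(-1) + poly(1)) + l * (poly(-2) + poly(2))


def sample(x):
    # An arbitrary quartic chosen only to exercise the formula.
    return 3 - 2 * x + 5 * x**2 + 7 * x**3 - 4 * x**4


print(h, k, l)
print(formula_one(sample), exact_area(3, 5, -4))
```

Because the formula was built to be exact for every polynomial up to the fourth degree, the two printed areas coincide as fractions and not merely to a number of decimals.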

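II. If

$$y_x = a + bx + cx^2 + dx^3$$

$$\int_{-1/2}^{1/2} y_x\,dx = \frac{1}{24}\{y_{-1} + 22\,y_0 + y_1\} \qquad \text{(II)}$$

III. If

$$y_x = a + bx + cx^2 + dx^3 + ex^4$$

$$\int_{-1/2}^{1/2} y_x\,dx = \frac{1}{1440}\{802(y_{1/2} + y_{-1/2}) - 93(y_{3/2} + y_{-3/2}) + 11(y_{5/2} + y_{-5/2})\} \qquad \text{(III)}$$

IV. If

$$y_x = a + bx + cx^2 + dx^3$$

$$\int_{-1/2}^{3/2} y_x\,dx = \frac{1}{24}\{27\,y_0 + 17\,y_1 + 5\,y_2 - y_3\} \qquad \text{(IV)}$$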

18. We can now take the calculation of the moments, where
$\int_{-1/2}^{n-1/2} y_x\,dx$ is required in terms of
$y_0, y_1, \dots, y_{n-1}$.

Now

$$\int_{-1/2}^{n-1/2} y_x\,dx = \int_{-1/2}^{1/2} y_x\,dx + \int_{1/2}^{3/2} y_x\,dx + \dots + \int_{n-3/2}^{n-1/2} y_x\,dx.$$

If formula (I) be applied it can be used for all the integrals
on the right-hand side of this equation except the first two
and the last two, and the values of these are given by (IV).
Summing the values obtained and writing (IV) with the
denominator 5760, we obtain

$$\int_{-1/2}^{n-1/2} y_x\,dx = \frac{1}{5760}\{6463\,y_0 + 4371\,y_1 + 6669\,y_2 + 5537\,y_3
  + 5760(y_4 + y_5 + \dots + y_{n-5}) + 5537\,y_{n-4}
  + 6669\,y_{n-3} + 4371\,y_{n-2} + 6463\,y_{n-1}\} \qquad \text{(V)}$$

which means that we can multiply

the first and last ordinates by $\frac{6463}{5760}$ (= 1.1220486),
the second and last but one by $\frac{4371}{5760}$ (= .7588542),
the third and last but two by $\frac{6669}{5760}$ (= 1.1578125),
the fourth and last but three by $\frac{5537}{5760}$ (= .9612847),

leave all the other ordinates unaltered, and work out the
moments in the usual way from this modified series of
ordinates. If there are less than eight ordinates, another
formula must be evolved.
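
The weights of formula (V) can be assembled from formulae (I) and (IV) in the manner just described. The Python sketch below is an illustration under stated assumptions: the function name `formula_v_numerators` is hypothetical, and the nine ordinates are those of the series graduated later in Table V, the first being taken as 51.81, the value implied by the total 207.18 of that series.

```python
# Numerators over the common denominator 5760.
I_NUMERATORS = {-2: -17, -1: 308, 0: 5178, 1: 308, 2: -17}           # formula (I)
IV_NUMERATORS = {0: 27 * 240, 1: 17 * 240, 2: 5 * 240, 3: -1 * 240}  # formula (IV)
DENOMINATOR = 5760


def formula_v_numerators(n):
    """Weights (times 5760) for ordinates y_0 .. y_{n-1}; needs n >= 8."""
    if n < 8:
        raise ValueError("formula (V) needs at least eight ordinates")
    weights = [0] * n
    # The first two and the last two unit bases are valued by formula (IV).
    for j, c in IV_NUMERATORS.items():
        weights[j] += c
        weights[n - 1 - j] += c
    # Every other unit base, centred on y_2 .. y_{n-3}, is valued by formula (I).
    for centre in range(2, n - 2):
        for offset, c in I_NUMERATORS.items():
            weights[centre + offset] += c
    return weights


original = [51.81, 43.74, 35.42, 27.80, 20.42, 13.79, 8.22, 4.29, 1.69]
weights = formula_v_numerators(len(original))
modified = [y * w / DENOMINATOR for y, w in zip(original, weights)]

print(weights)
print([round(v, 2) for v in modified])
```

For a longer series the interior weights all come out at 5760, so that only the four ordinates at each end are altered.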

19. In the following table the original series and the modified
one are set out in the first two columns, and in the other
columns the calculations of the first four moments about the
middle of the range by the direct method are shown:


Table V

Original   Modified by
 series    formula (V)
  y_x         y'_x        x     y'_x × x   y'_x × x²   y'_x × x³    y'_x × x⁴

 51.81       58.13       -4     -232.52      930.08    -3,720.32    14,881.28
 43.74       33.19       -3      -99.57      288.71      -866.13     2,598.39
 35.42       41.01       -2      -82.02      164.04      -328.08       656.16
 27.80       26.72       -1      -26.72       26.72       -26.72        26.72
 20.42       20.42        0        0           0            0            0
 13.79       13.26       +1       13.26       13.26        13.26        13.26
  8.22        9.52       +2       19.04       38.08        76.16       152.32
  4.29        3.26       +3        9.78       29.34        88.02       264.06
  1.69        1.90       +4        7.60       30.40       121.60       486.40

Negative items                  -440.83                 -4,941.25
Positive items                   +49.68                   +299.04

207.18      207.41              -391.15    1,520.63     -4,642.21    19,078.59


207.41 is then treated as the total frequency, and the
moments for unit frequency ($\mu'$) would be obtained by dividing
-391.15, 1520.63, etc. by 207.41, and not by 207.18, which is
not the "total frequency", but merely gives the uncorrected
sum of certain equidistant values.

20. The work can sometimes be simplified considerably,
for if the values at the ends of the experience are very small and
have a tendency to keep close to the axis of x before they
finally vanish (i.e. if there is high contact; most actuarial
functions, $D_x$, etc., have high contact at the old age end
of the table), then it is reasonable to suppose that ordinates
before the first and after the last exist, but are insignificant
in value. Thus the integral corresponding to the whole series
of ordinates can be legitimately extended beyond the limits
$-\frac{1}{2}$ and $n - \frac{1}{2}$ previously used, because the
additional area thus introduced will be evanescent. Now if the
area be so extended, the effect will be that in equation (V) the
significant ordinates from $y_0$ to $y_{n-1}$ will all have the
coefficient unity, and the ordinates with weighted coefficients
will all vanish.

The practical result is, that if there is high contact at one
end of the statistics the adjustment need only be made at the
other end, while if there is high contact at both ends no
adjustment is necessary.

Mathematically, high contact means that the first few
differential coefficients vanish at the point of contact. The
diagrams on pp. 71 and 83 show high contact at both ends of
the curves, and the diagram on p. 63 shows high contact at
the longer durations.

21. The second case in § 16, namely, that in which
mid-ordinates are used instead of areas, may now be examined.
By concentrating areas about the middle points of their
bases, we assume that the distances by which the areas
$\int_{-1/2}^{1/2} y_x\,dx$, $\int_{1/2}^{3/2} y_x\,dx$, etc. must be
multiplied are the same as the distances from $y_0$, $y_1$, etc.;
that is, the tth moment from the statistics is

$$X^t \int_{-1/2}^{1/2} y_x\,dx + (X+1)^t \int_{1/2}^{3/2} y_x\,dx + \dots + (X+n-1)^t \int_{n-3/2}^{n-1/2} y_x\,dx$$

and we require $\int_{-1/2}^{n-1/2} (X + x)^t y_x\,dx$, where $X$ is
the distance of $y_0$ from the ordinate about which moments are
calculated.

Applying formula (I) to each integral and collecting terms,
we reach as a general coefficient

$$\frac{1}{5760}\{\dots + [5178\,h^t + 308\{(h-1)^t + (h+1)^t\} - 17\{(h-2)^t + (h+2)^t\}]\,y_x + \dots\}$$

where $h$ is written for $X + x$ for simplification.

If $t = 1$ this becomes $h$
 "  $t = 2$   "     "     $h^2 + \frac{1}{12}$
 "  $t = 3$   "     "     $h^3 + \frac{1}{4}h$
 "  $t = 4$   "     "     $h^4 + \frac{1}{2}h^2 + \frac{1}{80}$

It has already been noticed that if there is high contact, the
value of $\int (X + x)^t y_x\,dx$ is found by using the unadjusted
ordinates, that is, the second moment is given by a series,
the general term of which is $h^2 y_x$, the third by a series, the
general term of which is $h^3 y_x$, and so on; hence, if $\mu$ be
written for the true adjusted moment about the mean and $\nu$ for
the unadjusted moment, the relations between $\mu$ and $\nu$ are
given by

$$\mu_2 + \tfrac{1}{12} = \nu_2 \quad\text{or}\quad \mu_2 = \nu_2 - \tfrac{1}{12}$$
$$\mu_4 + \tfrac{1}{2}\mu_2 + \tfrac{1}{80} = \nu_4 \quad\text{or}\quad \mu_4 = \nu_4 - \tfrac{1}{2}\nu_2 + \tfrac{7}{240}$$

The mean needs no adjustment, for if $t = 1$ the general term
has the correct coefficient $h$, and the third moment has to be
adjusted by $\frac{1}{4}$ of the first moment, which is zero where the
moments are taken about the mean.* In order to demonstrate
the correction for the nth moment by the above method,
a parabola of at least the nth order is necessary. If we apply
these adjustments to the moments found on p. 19, for Example
IV of Table I, we have $\mu_2 = 3.1503$, $\mu_3 = -1.430976$ and
$\mu_4 = 28.82866$. These adjustments are found to make an
appreciable difference in the constants obtained from the
moments, especially when there is a small number of terms.

* These adjustments were first given by W. F. Sheppard in Proc. Lond. Math.
Soc. XXIX, 353-80. See also Appendix I.

In other words, they allow for the grouping, and the lesson
to be learnt is that a moderate amount of grouping saves work
and, thanks to our knowledge of the correct adjustments, does
not introduce error in the circumstances described.
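
The adjustments for grouping amount to two subtractions, as the following sketch shows. It is an illustration only: `sheppard_adjust` is a name assumed here, the unit of grouping is taken as the unit of measurement, and the input moments are the unadjusted moments about the mean found in § 8.

```python
def sheppard_adjust(nu2, nu3, nu4):
    """Return (mu2, mu3, mu4) from unadjusted moments about the mean.

    Assumes high contact at both ends and moments expressed in units of grouping.
    """
    mu2 = nu2 - 1 / 12
    mu3 = nu3  # no adjustment: the first moment about the mean is zero
    mu4 = nu4 - nu2 / 2 + 7 / 240
    return mu2, mu3, mu4


# Unadjusted moments about the mean for Example IV, as given in § 8.
mu2, mu3, mu4 = sheppard_adjust(3.2336, -1.430976, 30.416289)
print(round(mu2, 4), mu3, round(mu4, 5))
```

If the moments were expressed in years rather than in units of 5 years, the corrections would have to be scaled by the corresponding powers of the unit; that scaling is not shown here.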

22. The practical conclusions in the two preceding
paragraphs as to the treatment of moments when there is high
contact can be checked numerically. The equation to a curve
with high contact at each end having been written down, we
can work out the ordinates at equidistant points or the areas
on equal bases and calculate the moments from the figures.
From the equation to the curve we can also calculate the area
and moments for the whole curve and it will be found that the
corresponding figures agree. A good curve with which to make
experiments in this way is "the normal curve of error"
because the ordinates and areas are accurately tabulated in
Tables for Statisticians, but anyone wishing to apply this sort
of check is advised to wait until he has read a little about
frequency-curves.

When there is not high contact at both ends of the curve,
the adjustments become more difficult to value; suggestions
have been made for finding the corrections, and this matter
is further discussed in Appendix I, but a beginner is advised
to avoid these refinements.

A student should calculate the moments for one or two
distributions, and make the easier adjustments; he can also
find the standard deviations of distributions, for the
S.D. = $\sqrt{\mu_2}$, where the $\mu_2$ has been adjusted in
accordance with the above rules. In Examples III and IV of
Table I there is clearly high contact, and in Example I the
rough moment should be used. In Examples II and V there is more
doubt, and in the calculation of the moments for Example II
(see p. 60), no adjustment was made.

This advice is given not because adjustment is unnecessary,
but because a beginner can content himself with mastering
the general idea and leave out some of the refinements until
he has a little more experience. Later on, when the methods of
Appendix I are examined, it will be seen that Sheppard's
adjustments alone do not usually improve the rough moments
when the distribution is abrupt.

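23. Before proceeding to deal with fitting more complicated
curves it is advisable to consider the application of the method
of moments to a simple case, namely, when

$$y = a + bx + cx^2 + dx^3 + \dots$$

Let the range be $2l$, and let the origin be at the middle point
of the range, and $m_0$ stand for the area and $m_n$ for the nth
moment of the whole distribution about the middle of the
range. Then

$$m_0 = \int_{-l}^{l} (a + bx + cx^2 + \dots)\,dx = 2l\Big(a + \frac{cl^2}{3} + \frac{el^4}{5} + \dots\Big)$$

and similarly

$$m_{2s} = 2l \times l^{2s}\Big(\frac{a}{2s+1} + \frac{cl^2}{2s+3} + \frac{el^4}{2s+5} + \dots\Big)$$

These equations show that the even moments give the
constants $a$, $c$, $e$, etc., and the odd moments give the constants
$b$, $d$, $f$, etc. This is, of course, the result of using moments
about the middle of the range, and makes the solution of the
equations less laborious than it would otherwise have been. The
solution can also be simplified a little by writing

$$\frac{1}{2l}\,\frac{m_{2s}}{l^{2s}} = \frac{a}{2s+1} + \frac{cl^2}{2s+3} + \dots$$

so that

$$\frac{1}{2l}\,m_0 = a + \frac{cl^2}{3} + \frac{el^4}{5} + \dots$$
$$\frac{1}{2l}\,\frac{m_2}{l^2} = \frac{a}{3} + \frac{cl^2}{5} + \frac{el^4}{7} + \dots$$
$$\frac{1}{2l}\,\frac{m_4}{l^4} = \frac{a}{5} + \frac{cl^2}{7} + \frac{el^4}{9} + \dots$$

and similarly

$$\frac{1}{2l}\,\frac{m_1}{l} = \frac{bl}{3} + \frac{dl^3}{5} + \frac{fl^5}{7} + \dots$$
$$\frac{1}{2l}\,\frac{m_3}{l^3} = \frac{bl}{5} + \frac{dl^3}{7} + \frac{fl^5}{9} + \dots$$
$$\frac{1}{2l}\,\frac{m_5}{l^5} = \frac{bl}{7} + \frac{dl^3}{9} + \frac{fl^5}{11} + \dots$$

The solution of these equations gives the constants required;
for example,

(i) if $y = a + bx$, we have

$$a = \frac{m_0}{2l}$$
$$b = \frac{3}{l}\cdot\frac{1}{2l}\,\frac{m_1}{l}$$

(ii) if $y = a + bx + cx^2$

$$a = \frac{3}{4}\Big\{3\,\frac{m_0}{2l} - 5\cdot\frac{1}{2l}\,\frac{m_2}{l^2}\Big\}$$
$$b = \frac{3}{l}\cdot\frac{1}{2l}\,\frac{m_1}{l}$$
$$c = \frac{15}{4l^2}\Big\{-\frac{m_0}{2l} + 3\cdot\frac{1}{2l}\,\frac{m_2}{l^2}\Big\}$$

(iii) if $y = a + bx + cx^2 + dx^3$

$$a = \frac{3}{4}\Big\{3\,\frac{m_0}{2l} - 5\cdot\frac{1}{2l}\,\frac{m_2}{l^2}\Big\}$$
$$b = \frac{15}{4l}\Big\{5\cdot\frac{1}{2l}\,\frac{m_1}{l} - 7\cdot\frac{1}{2l}\,\frac{m_3}{l^3}\Big\}$$
$$c = \frac{15}{4l^2}\Big\{-\frac{m_0}{2l} + 3\cdot\frac{1}{2l}\,\frac{m_2}{l^2}\Big\}$$
$$d = \frac{35}{4l^3}\Big\{-3\cdot\frac{1}{2l}\,\frac{m_1}{l} + 5\cdot\frac{1}{2l}\,\frac{m_3}{l^3}\Big\}$$

The above results, which can easily be extended if it is
wished, may now be applied to one or two numerical examples.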




24. As a first example, we shall graduate the statistics in
Table V, § 19, for which the moments about the middle of the
range have been calculated. Taking the curve $y = a + bx + cx^2$,
the following values from Table V will be required:

$$2l = 9 \qquad m_0 = 207.41 \qquad m_1 = -391.15 \qquad m_2 = 1520.63$$

Hence

$$a = \frac{3}{4}\Big\{\frac{622.23}{9} - \frac{5}{9}\cdot\frac{1520.63}{(4.5)^2}\Big\} = 20.563$$

$$b = \frac{3}{4.5}\times\frac{1}{9}\times\frac{-391.15}{4.5} = -6.4387$$

$$c = \frac{15}{4(4.5)^2}\Big\{-\frac{207.41}{9} + \frac{3}{9}\cdot\frac{1520.63}{(4.5)^2}\Big\} = .36815$$

25. The best way to obtain the ordinates corresponding
to this graduation is by calculating $b + c$ the first difference,
and $2c$ the second difference, from the middle term; their
values are -6.0706 and .7363 respectively. Since second
differences are constant, the work is done continuously, and
is as follows:

Graduated
ordinate         Δ

 52.208
               -9.016
 43.192
               -8.279
 34.913
               -7.543
 27.370
               -6.807
 20.563
               -6.071
 14.492
               -5.335
  9.157
               -4.599
  4.558
               -3.862
   .696




These graduated figures will be found to agree fairly well
with those given in the first column of Table V.
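
Tabulating a parabola by constant second differences needs nothing but addition once the starting values are found. The sketch below is a hedged illustration of the idea: it starts from the first ordinate rather than the middle term, the function name `ordinates_by_differences` is hypothetical, and the constants are the rounded values of $a$, $b$ and $c$ found in § 24.

```python
def ordinates_by_differences(a, b, c, half_width):
    """Tabulate y = a + b*x + c*x^2 for x = -half_width .. +half_width by addition."""
    x0 = -half_width
    y = a + b * x0 + c * x0 * x0       # first ordinate, computed directly
    first_diff = b + c * (2 * x0 + 1)  # y(x0 + 1) - y(x0)
    second_diff = 2 * c                # constant for a parabola
    ordinates = []
    for _ in range(2 * half_width + 1):
        ordinates.append(y)
        y += first_diff
        first_diff += second_diff
    return ordinates


a, b, c = 20.563, -6.4387, 0.36815
graduated = ordinates_by_differences(a, b, c, half_width=4)
print([round(v, 3) for v in graduated])
```

Since the differences in the hand computation are rounded at each step, the figures so obtained may differ from the column above by a few units in the third decimal place.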

26. As a further example the following statistics, taken from
a paper by S. H. J. W. Allin (J. Inst. Actu. xxxix, 350), and
giving the values of annuities to widows in pension funds
according to the age of the member, may be considered:


                  Modified by    Distance from
       Value of   formula (V),   middle of range
Age    annuity    p. 28          multiplied by 2
                     a'               d           a' × d     a' × d²      a' × d³

27      21.20      23.79             -7          -166.53    1165.71     -8159.97
32      19.91      15.11             -5           -75.55     377.75     -1888.75
37      19.34      22.40             -3           -67.20     201.60      -604.80
42      18.58      17.86             -1           -17.86      17.86       -17.86
47      16.74      16.09             +1            16.09      16.09        16.09
52      15.69      18.17             +3            54.51     163.53       490.59
57      14.70      11.15             +5            55.75     278.75      1393.75
62      12.99      14.58             +7           102.06     714.42      5000.94

Negative items                                   -327.14               -10671.38
Positive items                                   +228.41                +6901.37

Totals            139.15                          -98.73    2935.71     -3770.01


In calculating the above moments it has been assumed that
the figures to be graduated represent a system of ordinates;
if they had represented a system of areas, the adjustment by
formula (V) would have been unsuitable.

The alternative is to avoid the integral calculus and work
out from the equation $y = f(x)$ the sum of the ordinates and
the moments of the ordinates. In the particular case where
$f(x) = a + bx + cx^2 + \dots$ this is practicable, but there are many
expressions which, with their moments, can be integrated but
do not lend themselves to finite summation. We have therefore
confined attention to the more general method.

When there is an even number of terms the difficulty of
calculating the moments about the middle of the range is that
the terms have to be multiplied by .5, 1.5, 2.5, etc., and if the
series to be graduated contains only a few terms, it is best to
deal with the distance $d$, in the way shown above, and then
divide the totals by 2, 4 and 8, in order to obtain the first,
second and third moments respectively. In this way, we have

$$l = 4$$
$$m_0 = 139.15$$
$$m_1 = -49.36$$
$$m_2 = 733.93$$
$$m_3 = -471.25$$

We will now fit the statistics with each of the three curves,
the formulae for which have been given, and compare the
resulting graduations:

(i)   $y = 17.394 - 1.157x$
(ii)  $y = 17.633 - 1.157x - .0451x^2$
(iii) $y = 17.633 - 1.190x - .0451x^2 + .00365x^3$
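
The constants of the three curves follow from the formulae of § 23 applied to the moments just found. The sketch below is offered as an illustration only; `fit_by_moments` is a hypothetical name, and exact floating-point arithmetic need not reproduce the printed constants to the last decimal.

```python
def fit_by_moments(m, l, degree):
    """Fit a polynomial of degree 1, 2 or 3 on the range -l .. +l.

    m[k] is the k-th moment about the middle of the range (m[0] is the area).
    """
    # Scaled moments (1/2l) * m_k / l^k, as used in the formulae of § 23.
    scaled = [m_k / (2 * l * l ** k) for k, m_k in enumerate(m)]
    A0, B1 = scaled[0], scaled[1]
    if degree == 1:
        return {"a": A0, "b": 3 * B1 / l}
    A2 = scaled[2]
    a = 0.75 * (3 * A0 - 5 * A2)
    c = 15 / (4 * l ** 2) * (3 * A2 - A0)
    if degree == 2:
        return {"a": a, "b": 3 * B1 / l, "c": c}
    if degree == 3:
        B3 = scaled[3]
        b = 15 / (4 * l) * (5 * B1 - 7 * B3)
        d = 35 / (4 * l ** 3) * (5 * B3 - 3 * B1)
        return {"a": a, "b": b, "c": c, "d": d}
    raise ValueError("only degrees 1, 2 and 3 are covered by this sketch")


moments = [139.15, -49.36, 733.93, -471.25]
line = fit_by_moments(moments, 4, 1)
parabola = fit_by_moments(moments, 4, 2)
cubic = fit_by_moments(moments, 4, 3)
print(line)
print(parabola)
print(cubic)
```

Here $x$ is measured from the middle of the range (age 44.5) in units of 5 years, so that the age 27 corresponds to $x = -3.5$ and the age 62 to $x = +3.5$.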

The following table shows the graduations:

Age   Ungraduated    (i)      (ii)     (iii)

27       21.20      21.44    21.13    21.13
32       19.91      20.29    20.24    20.28
37       19.34      19.13    19.27    19.31
42       18.58      17.97    18.20    18.22
47       16.74      16.82    17.04    17.02
52       15.69      15.66    15.80    15.76
57       14.70      14.50    14.46    14.43
62       12.99      13.34    13.03    13.05

Formulae (ii) and (iii) are practically identical, and both are
considerably closer to the original figures than (i).
27. The results obtained so far may be summarised as
follows:

(1) The method of moments is a general method of finding
the constants in a formula suitable to a particular
statistical example, and it consists of equating the values
of $\Sigma f(n) \times n^t$ (which is called the tth moment, and is
summed for all values of n that occur) to similar
expressions obtained from the graduation formula. These
latter expressions will be algebraic, and simultaneous
equations have to be solved in order to find the
arithmetical constants.

(2) The moments from the statistics can be calculated by
multiplying the frequencies by appropriate values of $n^t$
or by the summation method.

(3) If moments have been obtained about any one vertical,
they can be transferred to any other by the formulae
in § 6 of this chapter.

(4) Since the moments from the graduation formula must
generally be found by means of the integral calculus,
while those from the statistics are found by summation,
the latter have to be adjusted before the equations for
obtaining the constants can be correctly formed. The
adjustments depend on whether the statistics are a
system of ordinates or a system of areas; in the former
case adjustment is made by equation (V), and in the
latter by the formulae in § 21 if there is high contact at
both ends of the curve.



CHAPTER IV 


PEARSON’S SYSTEM OF
FREQUENCY-CURVES

1. When it becomes necessary in practical work to decide
on a system of curves for describing frequency distributions,
we have to bear in mind that

(1) Any expression used must be a graduation formula;
it must remove the roughness of the material.

(2) There must not be so many constants in the formula
that we require a great number of moments, for this
means that the accuracy is reduced. The higher the
moment the more liable it is to error when deduced
from ungraduated observations; this is clear, when we
remember that the ends of the experiences are multiplied
by the highest numbers and their powers.

(3) There must be a systematic method of approaching
frequency distributions.

2. Now, considering the more obvious characteristics of
frequency distributions, we find they generally start at zero,
rise to a maximum, and then fall sometimes at the same but
often at a different rate. At the ends of the distribution there
is often high contact. This means, mathematically, that a series
of equations $y = f(x)$, $y = \phi(x)$, etc., must be chosen, so that
in each equation of the series $dy/dx = 0$ for certain values of
$x$, namely, at the maximum and at the end of the curve
where there is contact with the axis of $x$.

The above suggests that $dy/dx$ may be put equal to $\dfrac{y(x + a)}{F(x)}$;
then, if $y = 0$, $dy/dx = 0$, and there is, therefore, contact at
one end of the curve, while if $x = -a$, $dy/dx = 0$, and we
have the maximum we require. So long as $F(x)$ is general the
form assumed for $dy/dx$ is extremely general and includes
cases when $dy/dx$ may not be zero when $y$ is zero. If $F(x)$ is
expanded by Maclaurin’s theorem in ascending powers of $x$,
we have

$$\frac{1}{y}\frac{dy}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2 + \dots} \qquad \text{(I)}$$

We shall return to this equation and show how it can be put in
the form $y = f(x)$, so as to express $y$ as a direct function of $x$,
and we shall see that we have obtained something more general
than is implied at the beginning of this paragraph. We shall
obtain curves taking various widely different shapes. As the
matter has up to the present been approached from an
experimental point of view, it will be interesting to see how equation
(I) can be obtained up to the $x^2$ term in the denominator from
elementary propositions in the theory of probabilities.
3. If $p$ be the probability of an event happening and $q$ the
probability of its failing, then the probabilities of its happening
once, twice, and so on out of $n$ trials are given by the terms of
the expansion $(p + q)^n$, or if we have $N$ cases, the terms of
$N(p + q)^n$ give the frequency distribution of the $N$ cases into
$n + 1$ groups. The binomial series does not represent nearly all
the probabilities that arise, and another series that occurs is the
hypergeometrical. Thus the chances of getting $r, r - 1, \dots, 0$
black balls from a bag containing $pn$ black and $qn$ white balls
when $r$ balls are drawn, are given by the successive terms of the
series

$$\frac{pn(pn-1)\dots(pn-r+1)}{n(n-1)\dots(n-r+1)} \times \left\{1 + \frac{r\,qn}{pn-r+1} + \frac{r(r-1)}{2!}\,\frac{qn(qn-1)}{(pn-r+1)(pn-r+2)} + \dots\right\}$$

A numerical example may help to make clear the way the
series arises. A bag contains seven balls, of which four are black
and three white; then if three balls are drawn the probability
that

All will be black is $\dfrac{4 \cdot 3 \cdot 2}{7 \cdot 6 \cdot 5}$

Two will be black is $\dfrac{4 \cdot 3 \cdot 3}{7 \cdot 6 \cdot 5} \times 3$

One will be black is $\dfrac{4 \cdot 3 \cdot 2}{7 \cdot 6 \cdot 5} \times 3$

None will be black is $\dfrac{3 \cdot 2 \cdot 1}{7 \cdot 6 \cdot 5}$

The sum of these four expressions is unity. The terms can be
seen to agree with the series by putting $n = 7$, $pn = 4$, $qn = 3$,
and $r = 3$.
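The four probabilities of the bag example can be reproduced exactly with a few lines of Python. This is only an illustrative sketch: it counts combinations directly rather than following the series term by term, and the function name is our own.

```python
from fractions import Fraction
from math import comb


def black_ball_probabilities(black, white, drawn):
    """Exact chance of each possible number of black balls when `drawn`
    balls are taken without replacement from the bag."""
    ways_total = comb(black + white, drawn)
    return {
        k: Fraction(comb(black, k) * comb(white, drawn - k), ways_total)
        for k in range(drawn, -1, -1)
    }


probs = black_ball_probabilities(black=4, white=3, drawn=3)
for k, chance in probs.items():
    print(f"{k} black: {chance}")
print("sum:", sum(probs.values()))
```

Working in exact fractions makes it easy to see that the four terms sum to unity, as stated above.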

Other series may arise, but those given will be sufficient for
the present purpose, and we shall proceed to consider how they
can be put in the form of equation (I). The inconvenience of the
expressions as they now stand becomes fairly obvious when an
attempt is made to calculate numerical values for a large
number of groups, and besides this, they are not continuous, while
the statistics of practical work often are.

Considering the hypergeometrical series, and remembering
that the function required for equation (I) is $\dfrac{1}{y}\dfrac{dy}{dx}$, and that,
as the series is discontinuous, finite differences must be used,
we have

$$y_x = \frac{pn(pn-1)\dots(pn-r+1)}{n(n-1)\dots(n-r+1)} \cdot \frac{r(r-1)\dots(r-x+2)}{(x-1)!} \cdot \frac{qn(qn-1)\dots(qn-x+2)}{(pn-r+1)(pn-r+2)\dots(pn-r+x-1)}$$

$$\Delta y_x = y_{x+1} - y_x = y_x\left\{\frac{(r-x+1)(qn-x+1)}{x(pn-r+x)} - 1\right\} = y_x\left\{\frac{(r+1)(qn+1) - x(n+2)}{x(pn-r+x)}\right\}$$

for $p + q = 1$, and

$$y_{x+\frac{1}{2}} = \tfrac{1}{2}(y_{x+1} + y_x) = \tfrac{1}{2}\,y_x\left\{\frac{(r+1)(qn+1) - x[2(r+1) + n(q-p)] + 2x^2}{x(pn-r+x)}\right\}$$

$$\therefore \quad \frac{\Delta y_x}{y_{x+\frac{1}{2}}} = \frac{2\{(r+1)(qn+1) - x(n+2)\}}{(r+1)(qn+1) - x[2(r+1) + n(q-p)] + 2x^2}$$

which may be put in the form of equation (I),

$$\frac{1}{y}\frac{dy}{dx} = \frac{a + x}{b_0 + b_1 x + b_2 x^2}$$

In this form the actuarial reader will naturally think of the
force of mortality; to proceed from the force of mortality, after
changing its sign, to the "number living" ($l_x$) in a life table is
the same thing as to proceed from the formula just given to
a frequency-curve.
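The finite-difference ratio derived above can be verified numerically. The following Python sketch compares the ratio formed directly from consecutive hypergeometric terms with the closed expression, using exact fractions. It takes $y_x$ to be the chance of drawing $x - 1$ white balls, as in the series of § 3; the function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def term(x, n_black, n_white, r):
    """y_x: chance of x - 1 white (and r - x + 1 black) balls in r draws."""
    return Fraction(comb(n_black, r - x + 1) * comb(n_white, x - 1),
                    comb(n_black + n_white, r))


def slope_ratio_direct(x, n_black, n_white, r):
    """(y_{x+1} - y_x) divided by the mean of y_{x+1} and y_x."""
    y_x = term(x, n_black, n_white, r)
    y_next = term(x + 1, n_black, n_white, r)
    return (y_next - y_x) / ((y_next + y_x) / 2)


def slope_ratio_formula(x, n_black, n_white, r):
    """Closed form, with pn = n_black, qn = n_white and n = pn + qn."""
    n = n_black + n_white
    numerator = 2 * ((r + 1) * (n_white + 1) - x * (n + 2))
    denominator = ((r + 1) * (n_white + 1)
                   - x * (2 * (r + 1) + (n_white - n_black))
                   + 2 * x * x)
    return Fraction(numerator, denominator)


for x in range(1, 4):
    print(x, slope_ratio_direct(x, 4, 3, 3), slope_ratio_formula(x, 4, 3, 3))
```

The numerator is linear in $x$ and the denominator quadratic, which is exactly the shape of the right-hand side of equation (I).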

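4. Returning to equation (I), we see that it can be written
in the form

$$(b_0 + b_1 x + b_2 x^2 + \dots)\frac{dy}{dx} = y(x + a);$$

multiplying each side by $x^n$ and integrating with respect to $x$,
we have

$$\int x^n (b_0 + b_1 x + b_2 x^2 + \dots)\frac{dy}{dx}\,dx = \int y x^{n+1}\,dx + \int y a x^n\,dx$$

Integrate the left-hand side by parts treating $dy/dx$ as one
part, and the right-hand side as the sum of two functions, and
then

$$x^n(b_0 + b_1 x + b_2 x^2 + \dots)\,y - \int \{n b_0 x^{n-1} + (n+1) b_1 x^n + (n+2) b_2 x^{n+1} + \dots\}\,y\,dx = \int y x^{n+1}\,dx + \int y a x^n\,dx$$

or, if at the ends of the range of the curve the expression
$x^n(b_0 + b_1 x + b_2 x^2 + \dots)\,y$ vanishes, we have

$$-n b_0 \mu'_{n-1} - (n+1) b_1 \mu'_n - (n+2) b_2 \mu'_{n+1} - \dots = \mu'_{n+1} + a\mu'_n$$

where we use the notation we have already adopted, namely,
$\mu'_n = \int y x^n\,dx$.

If we put $n = 0, 1, 2, \dots, s$ respectively, we get $s + 1$ equations
to enable us to find $a$, $b_0$, $b_1$, etc., in terms of the moments
($\mu'$) as shown by the following equations, which have been
obtained by writing the equation in the form

$$a\mu'_n + n b_0 \mu'_{n-1} + (n+1) b_1 \mu'_n + (n+2) b_2 \mu'_{n+1} + \dots = -\mu'_{n+1}$$

and then putting $n = 0, 1, 2$, etc.

$$\begin{aligned}
a\mu'_0 + 0 \times b_0 + b_1\mu'_0 + 2b_2\mu'_1 + \dots &= -\mu'_1 \\
a\mu'_1 + b_0\mu'_0 + 2b_1\mu'_1 + 3b_2\mu'_2 + \dots &= -\mu'_2 \\
a\mu'_2 + 2b_0\mu'_1 + 3b_1\mu'_2 + 4b_2\mu'_3 + \dots &= -\mu'_3 \\
a\mu'_3 + 3b_0\mu'_2 + 4b_1\mu'_3 + 5b_2\mu'_4 + \dots &= -\mu'_4 \\
\text{etc.} &\quad \text{etc.}
\end{aligned} \qquad \text{(II)}$$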

Let us now make $\mu'_1 = 0$, and alter the other moments in
the way indicated in Chapter III, for the result of making
$\mu'_1 = 0$ is to change the origin of the system to the mean of the
distribution. We can also treat $\mu'_0$ as 1, and these
simplifications lead to the following results:

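(1) Keeping $b_0$ only, we have

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x}{\mu_2}$$

(2) Keeping $b_0$ and $b_1$, the first three equations in the
system (II) above give

$$a + b_1 = 0, \qquad b_0 = -\mu_2 \qquad \text{and} \qquad a\mu_2 + 3b_1\mu_2 = -\mu_3,$$

so that

$$a = -b_1 = \frac{\mu_3}{2\mu_2},$$

and the differential equation becomes

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\mu_3}{2\mu_2}}{\mu_2 + \dfrac{\mu_3}{2\mu_2}\,x}$$

(3) Keeping $b_0$, $b_1$ and $b_2$, the system gives

$$\begin{aligned}
a + b_1 &= 0 \\
b_0 + 3b_2\mu_2 &= -\mu_2 \\
a\mu_2 + 3b_1\mu_2 + 4b_2\mu_3 &= -\mu_3 \\
a\mu_3 + 3b_0\mu_2 + 4b_1\mu_3 + 5b_2\mu_4 &= -\mu_4
\end{aligned}$$

The solution of these simultaneous equations is perfectly
straightforward, and leads to

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\mu_3(\mu_4 + 3\mu_2^2)}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2}}{\dfrac{\mu_2(4\mu_2\mu_4 - 3\mu_3^2)}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2} + \dfrac{\mu_3(\mu_4 + 3\mu_2^2)}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2}\,x + \dfrac{2\mu_2\mu_4 - 3\mu_3^2 - 6\mu_2^3}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2}\,x^2}$$

In this last form put $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$ and $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$:

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\sqrt{\mu_2}\sqrt{\beta_1}(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}}{\dfrac{\mu_2(4\beta_2 - 3\beta_1) + \sqrt{\mu_2}\sqrt{\beta_1}(\beta_2 + 3)\,x + (2\beta_2 - 3\beta_1 - 6)\,x^2}{2(5\beta_2 - 6\beta_1 - 9)}} \qquad \text{(III)}$$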


5. The reasoning by which equation (I) was first obtained
showed that $a$ is the distance between the origin and the mode,
or as the origin has now been transferred to the mean by putting
$\mu'_1 = 0$, $a$ is the distance between the mean and the mode.
This distance in terms of the moments is, therefore,

$$\frac{\sigma\sqrt{\beta_1}(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}$$

where $\sigma$ is the standard deviation.

Since the skewness is the distance between the mean and
mode divided by the standard deviation,

$$\text{Skewness} = \frac{\sqrt{\beta_1}(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}$$
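As a check on the algebra of §§ 4 and 5, the following Python sketch computes $a$, $b_0$, $b_1$ and $b_2$ from the moments about the mean, and the skewness from $a$. The sign convention is that of equation (I), so $b_0$ and $b_2$ carry the negative signs that appear as a single minus in front of the fraction in equation (III). The moments used are those of the Type I example in Chapter V; the function name is our own.

```python
from math import sqrt


def pearson_coefficients(mu2, mu3, mu4):
    """Solve the four equations of system (II) with mu_0 = 1, mu_1 = 0,
    keeping b_0, b_1 and b_2 (closed-form solution)."""
    common = 10 * mu2 * mu4 - 18 * mu2 ** 3 - 12 * mu3 ** 2
    a = mu3 * (mu4 + 3 * mu2 ** 2) / common
    b0 = -mu2 * (4 * mu2 * mu4 - 3 * mu3 ** 2) / common
    b1 = -a
    b2 = -(2 * mu2 * mu4 - 3 * mu3 ** 2 - 6 * mu2 ** 3) / common
    return a, b0, b1, b2


mu2, mu3, mu4 = 7.66237, 15.1069, 172.326  # Type I example, Chapter V
a, b0, b1, b2 = pearson_coefficients(mu2, mu3, mu4)
skewness = a / sqrt(mu2)  # distance mean to mode over standard deviation

print("mean - mode:", round(a, 4))
print("skewness   :", round(skewness, 4))
```

With these moments the distance between mean and mode comes out close to the 2.223 quoted in the example, which is a useful confirmation that the closed forms satisfy the simultaneous equations.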


6. It would be possible to obtain constants in the differential
equation (I) by using a greater number of terms and retaining
$b_3$, $b_4$, etc., but there are strong practical objections to such
a course. Besides the increase in arithmetical work, the gain
in introducing additional constants is small because the higher
moments become untrustworthy, as we have already noticed.
Karl Pearson has shown* that "We might easily on a random
sample reach a 7th or 8th moment having half or double the
value it actually has in the general population. Constants
based on these high moments will be practically idle. They
may enable us to describe closely an individual random sample,
but no safe argument can be drawn from this individual
sample as to the general population at large, at any rate so
far as the argument is based on the constants depending on
these high moments." In some actuarial statistics where there
are as many as 100,000 cases, it might be worth while to go
as far as the next term of the series, but even here the value
of the work is discounted because any other smaller body of
statistics on the same subject could not be compared
satisfactorily with the result. For practical purposes it is probable
that the equation taken as far as $b_2$ will be sufficient, and we
shall confine our attention to the forms thus obtained.

* "Skew correlation and non-linear regression", Drapers' Company Res. Mem.
1905, p. 9. See also Chapter X.

7. Turning to the particular form of equation (I) given in
equation (III) it will be seen that it is possible to obtain a
formula representing the statistics by inserting in that
equation the values of the moments found from the statistics, but
this would not give a graduation in the same form as that in
which the original data appeared, for in the latter we have $y$,
while the former gives $\dfrac{1}{y}\dfrac{dy}{dx}$ or $\dfrac{d \log y}{dx}$. It would, therefore,
be necessary to integrate the expression we obtain in order to
get terms comparable with the original data, and it is better in
practical work to deal with the equations in the forms in which
we require them for comparison, rather than by using the
differential equation and then integrating the result. The latter
method could only give proportional not actual frequencies.

8. The next step is, therefore, to replace the equation

$$\frac{d \log y}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2}$$

by one of the form $y = f(x)$, and to do this $\dfrac{x + a}{b_0 + b_1 x + b_2 x^2}$ must
be integrated.

Let us consider equation (III) as a general expression for
integration; then we notice that the form the integral takes
depends on the particular values of the coefficients of $x$ in the
denominator. The problem is, in fact, merely a consideration
of the forms taken by the denominator, for

$$b_0 + b_1 x + b_2 x^2 = b_2\left\{x - \frac{-b_1 + \sqrt{b_1^2 - 4b_0 b_2}}{2b_2}\right\}\left\{x - \frac{-b_1 - \sqrt{b_1^2 - 4b_0 b_2}}{2b_2}\right\}$$

and the criterion for fixing the form in a particular case is,
obviously, the same as that for the nature of the roots of the
equation $b_0 + b_1 x + b_2 x^2 = 0$, viz. $b_1^2/(4b_0 b_2)$, which, by
substituting from formula (III), gives

$$\frac{\beta_1(\beta_2 + 3)^2}{4(2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)} \qquad \text{(IV)}$$


9. If expression (IV) is negative the roots are real and of
different sign, and we get one of the main types of curve,
called Type I by Karl Pearson, to whom this system of curves
is due; if expression (IV) is positive and less than unity the
roots are complex, and we get the second main type (Pearson’s
Type IV), and if expression (IV) is positive and greater than
unity the roots are real and of the same sign, and we reach the
third main type (Pearson’s Type VI).

This really covers the whole field, but in the limiting cases
when one type changes into another we reach simpler forms of
transition curves. Thus when the criterion is large
(theoretically infinite) one root is $\infty$ (Type III), when it is unity the
two roots are equal (Type V), and when it is zero the roots are
equal in magnitude but of opposite sign (Type II). If in the
last case $b_1 = b_2 = 0$ we reach what we shall call the "normal
curve of error"; this name is open to some objection just as
are the other names given to it (e.g. Probability curve,
Gaussian curve, etc.). Then again the expression for $(d \log y)/dx$
may be reducible to the form $a'/(b_0' + b_1' x)$ and we have a
binomial or a straight line for the frequency-curve (cf. Types
VIII, IX and XI), while if the expression reduces to a constant
the curve is the ordinary geometrical progression which we
are pleased to find as a special case of a system of
frequency-curves because we are already familiar with it in the theory of
probability in connection with sequences from coin tossing,
etc. As we proceed we shall find that in certain circumstances
the curves may be J-shaped or even U-shaped, with limits of
a single ordinate or two separated ordinates. A diagram at
the end of the book will give the reader an idea of the variety
of shapes taken by the curves evolved from the formula

$$\frac{d \log y}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2}$$

In practice we shall require the equations to the various kinds
of frequency-curve, and we shall also want to know which
type should be used in a particular case. We cannot usually
guess the type from the appearance of the rough data and
need an arithmetical test.
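The arithmetical test can be sketched in Python as follows. The code evaluates expression (IV) and applies the rules of § 9; the handling of the exact transition values (the `tolerance` argument, and the convention that a symmetrical distribution has criterion zero) is our own assumption for illustration and is not laid down in the text.

```python
import math


def criterion(beta1, beta2):
    """Expression (IV). A symmetrical distribution (beta1 = 0) is given the
    value 0; a vanishing denominator is reported as infinity."""
    if beta1 == 0:
        return 0.0
    denominator = 4 * (2 * beta2 - 3 * beta1 - 6) * (4 * beta2 - 3 * beta1)
    if denominator == 0:
        return math.inf
    return beta1 * (beta2 + 3) ** 2 / denominator


def curve_type(beta1, beta2, tolerance=1e-9):
    """Main or transition type indicated by the criterion."""
    kappa = criterion(beta1, beta2)
    if math.isinf(kappa):
        return "Type III"
    if abs(kappa) <= tolerance:
        return "Normal curve" if abs(beta2 - 3) <= tolerance else "Type II"
    if abs(kappa - 1) <= tolerance:
        return "Type V"
    if kappa < 0:
        return "Type I"
    if kappa < 1:
        return "Type IV"
    return "Type VI"


# beta values of the Type I example worked in Chapter V
print(round(criterion(0.5072955, 2.935110), 4), curve_type(0.5072955, 2.935110))
```

In practice the criterion never falls exactly on a transition value, so the choice of a transition type is a matter of judgment about how close is close enough, as discussed in Chapter V.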

10. We will deal first with the equations to the
frequency-curves, that is, with the actual integration, and begin with the
three main types.

First Main Type (Pearson's Type I). The factors in the
denominator, when the roots of $b_0 + b_1 x + b_2 x^2 = 0$ are real
and of different signs, take the form

$$b_2\left\{x - \frac{-b_1 + [\text{a positive quantity}]}{2b_2}\right\}\left\{x - \frac{-b_1 - [\text{a positive quantity}]}{2b_2}\right\}$$

and the expression to be integrated is therefore of the form

$$\frac{x + a}{b_2(x + A_1)(x - A_2)} = \frac{1}{b_2}\,\frac{A_1 - a}{A_1 + A_2}\,\frac{1}{x + A_1} + \frac{1}{b_2}\,\frac{A_2 + a}{A_1 + A_2}\,\frac{1}{x - A_2}$$

by partial fractions.

The integration is now simple, and gives

$$\log y = \frac{A_1 - a}{b_2(A_1 + A_2)}\log(x + A_1) + \frac{A_2 + a}{b_2(A_1 + A_2)}\log(x - A_2) + \text{a constant}$$

$$y = y'(x + A_1)^{\frac{A_1 - a}{b_2(A_1 + A_2)}}(x - A_2)^{\frac{A_2 + a}{b_2(A_1 + A_2)}}$$

where $y'$ results from the constant introduced by integration.
If the origin is now transferred to the mode (i.e. put $x$ for
$x + a$), we have

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{m_1}\left(1 - \frac{x}{a_2}\right)^{m_2}$$

where $m_1/a_1 = m_2/a_2$.

Second Main Type (Pearson's Type IV). If the roots of the
equation $b_0 + b_1 x + b_2 x^2 = 0$ are complex, it is impossible to
throw the denominator into real factors, and when this occurs
we have to integrate by putting the expression on the
right-hand side of the fundamental differential equation in the form

$$\frac{X + c}{b_2(X^2 + A^2)}$$

where $X = x + \dfrac{b_1}{2b_2}$, $c = a - \dfrac{b_1}{2b_2}$ and $A^2 = \dfrac{b_0}{b_2} - \dfrac{b_1^2}{4b_2^2}$.

Then

$$\log y = \int \frac{X + c}{b_2(X^2 + A^2)}\,dX = \int \frac{X}{b_2(X^2 + A^2)}\,dX + \int \frac{c}{b_2(X^2 + A^2)}\,dX$$

$$= \frac{1}{2b_2}\log(X^2 + A^2) + \frac{c}{b_2 A}\tan^{-1}\frac{X}{A} + \text{constant}$$

or

$$y = y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}$$

where $a$ has a meaning different from that implied in equation
(I). The relation between this type and Type I can be seen by
factorising the denominator of the right-hand side of the
differential equation, $b_2(X + iA)(X - iA)$, and then obtaining
an expression for $y$ having the same form as Type I, but
containing complex expressions.

Third Main Type (Pearson's Type VI). The factorising is
the same as Type I, but the roots of the equation being of
like sign, the factors of the denominator take the form
$(x + A_1)(x + A_2)$. The work is then the same, but at the end the
origin is put by Pearson not at the mode but so that one of the
expressions $x + A_1$ or $x + A_2$ can be written as $x$. The form is

$$y = y_0(x - a)^{q_2}x^{-q_1}$$
11. We may now set out a few of the transition types.
Pearson’s Type II is the same as his Type I when $A_1 = A_2$.
Type III. This type is reached when the criterion is $\infty$,
which happens when $b_2 = 0$,

$$\log y = \int \frac{x + a}{b_0 + b_1 x}\,dx = \int\left\{\frac{1}{b_1} + \frac{a - b_0/b_1}{b_0 + b_1 x}\right\}dx$$

$$= \frac{x}{b_1} + \frac{a b_1 - b_0}{b_1^2}\log(b_1 x + b_0) + \text{constant}$$

and

$$y = y' e^{x/b_1}(b_1 x + b_0)^{(a b_1 - b_0)/b_1^2}$$

or, by changing the origin,

$$y = y_0 e^{-\gamma x}\left(1 + \frac{x}{a}\right)^{\gamma a}$$

where $a$ has a meaning different from that implied in equation
(I). This type can be seen to be a particular case of Type I
when $a_2$ becomes infinite.

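Type V. In this case, when the roots are real and equal,

$$\log y = \int \frac{x + a}{b_2(x + b_1/2b_2)^2}\,dx = \int \frac{1}{b_2}\,\frac{(x + b_1/2b_2) + (a - b_1/2b_2)}{(x + b_1/2b_2)^2}\,dx$$

$$= \int \frac{dx}{b_2(x + b_1/2b_2)} + \int \frac{a - b_1/2b_2}{b_2(x + b_1/2b_2)^2}\,dx$$

$$= \frac{1}{b_2}\log(x + b_1/2b_2) - \frac{a - b_1/2b_2}{b_2(x + b_1/2b_2)} + \text{constant}$$

$$y = y'(x + b_1/2b_2)^{1/b_2}\,e^{-\frac{a - b_1/2b_2}{b_2(x + b_1/2b_2)}} = y_0 x^{-p} e^{-\gamma/x}$$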

Normal Curve of Error. Putting $b_1 = b_2 = 0$,

$$\log y = \int \frac{x + a}{b_0}\,dx = \frac{x^2}{2b_0} + \frac{ax}{b_0} + \text{constant} = \frac{(x + a)^2}{2b_0} + \text{constant}$$

or, by changing the origin and altering the constant,

$$y = y_0 e^{-x^2/2\sigma^2}$$

In a similar way the other less important transition curves
can be obtained. These are


/, ' x\-”‘ /, a;\»‘ 

' (‘+«) ■ 




’■J'jcr 


and we reach J-shaped curves when in Type I either $m_1$ or $m_2$
is negative and U-shaped curves when both are negative.
J-shaped curves can also be obtained with Type III.

12. A table is inserted which gives the list of curves with
Pearson's numbering and with the origin as he generally uses it.
This is convenient because in reading other work on the subject
it will be found that Pearson's numbering, etc. is usually
adopted. I have, however, added a note of the equation to
each curve when the origin is at the mean. There is something
to be said for uniformity as regards the origin, and the mean
is convenient because all distributions have means and the
moments are worked out about the vertical through the mean.

A column in the table gives criteria to show which curve should
be used in an individual case.

We may here deal with a little difficulty that students
sometimes encounter in connection with types which may be
expressed in the same algebraic form (e.g. Types VIII, IX
and XI can all be written $hx^k$). The question may be asked
why we should not fit $hx^k$ from $a$ to $b$ and find $h$, $k$, $a$ and $b$ from
the equations for the moments. The answer is that the criteria
afford in effect a simplification of the equations and
automatically tell us a good deal about the value of the constants
and the range of the curve.

13. We shall return to some of the technical points when
discussing numerical examples in the next chapter but may
now recapitulate the method, and see the steps that have to
be taken to fit a frequency-curve to statistics.

(1) Arrange the statistics in sequence.

(2) Calculate the moments about a convenient vertical.

(3) Transfer the moments to the centroid vertical (vertical
through the mean).

(4) If there is high contact at both ends of the curve,
apply Sheppard's adjustments to the moments (i.e.
deduct $\frac{1}{12}$ and $\frac{1}{2}\nu_2 - \frac{7}{240}$ from the second and fourth
moments respectively). If there is not high contact, see
Appendix I.

(5) Calculate the criterion.

(6) By means of Table VI decide which curve should be
used.

As an alternative to (5) and (6), the curve to be used can be
found from diagrams in Tables for Statisticians, which show
the type in terms of $\beta_1$ and $\beta_2$.
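Steps (2) to (4) can be sketched in Python as below. The frequencies are those of the Type I example in Chapter V, with age 42 taken as the arbitrary origin and five years as the unit; the function names are ours, and Sheppard's adjustments are shown only to illustrate step (4), since they were not applied in that example.

```python
def moments_about_mean(frequencies, positions):
    """Moments about an arbitrary origin, transferred to the mean.
    Returns (d, nu2, nu3, nu4), d being the distance of the mean
    from the arbitrary origin."""
    total = sum(frequencies)
    raw = [sum(f * x ** power for f, x in zip(frequencies, positions)) / total
           for power in range(1, 5)]
    d, raw2, raw3, raw4 = raw
    nu2 = raw2 - d ** 2
    nu3 = raw3 - 3 * d * raw2 + 2 * d ** 3
    nu4 = raw4 - 4 * d * raw3 + 6 * d ** 2 * raw2 - 3 * d ** 4
    return d, nu2, nu3, nu4


def sheppard_adjust(nu2, nu3, nu4):
    """Sheppard's adjustments for high contact at both ends (unit grouping)."""
    return nu2 - 1 / 12, nu3, nu4 - nu2 / 2 + 7 / 240


frequencies = [34, 145, 156, 145, 123, 103, 86, 71, 55, 37, 21, 13, 7, 3, 1]
ages = list(range(17, 88, 5))                    # central ages 17, 22, ..., 87
positions = [(age - 42) / 5 for age in ages]     # origin at age 42, unit 5 years

d, nu2, nu3, nu4 = moments_about_mean(frequencies, positions)
print("mean age:", 42 + 5 * d)
print("nu2, nu3, nu4:", round(nu2, 5), round(nu3, 4), round(nu4, 3))
```

The result does not depend on the origin chosen; age 42 merely keeps the multipliers small, which mattered more for hand calculation than it does here.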



CHAPTER V 


CALCULATION 

1. The next point to be considered is the calculation of the
constants for any particular distribution, when the moments
have been calculated and the type to be used has been decided.
The formulae required for the numerical work will be given for
each type; a numerical example, including the calculation of
the graduated figures, will follow, with the proofs of the
formulae.

2. Some general points relating to the calculation of the
curves when the constants have been found may be
conveniently considered here. When the constants are known,
we can calculate the ordinate for any value of $x$ by substituting
that value in the expression for the frequency-curve, and if
areas are required, some method of proceeding from ordinates
to areas must be found. The most simple is probably to
calculate mid-ordinates, and then by the quadrature formula (I)
or (II) on p. 27 find the areas. It is occasionally more
convenient to calculate the ordinates at the beginning of each
group, and then formula (III) should be used. These formulae
can be best applied in the form of differences; thus, from (II)
we have

from (I) 

= 2/0 - V7ro{^y~i - ^2/o} + 5f6o{^2/~2 - 

from (III) 

Formula (II) is generally sufficiently accurate, while the others
will be found to give a result true to five figures in ordinary
cases; exceptional cases will be referred to in the numerical
examples that follow.

3. It is sometimes a help to see the graduation expressed
graphically, and this has been done with some of the examples.
The best method is to insert a vertical height at the mode,
note the ends of the curve, and the heights of the ordinates that
have been calculated. These heights give points on the curve,
which can be drawn through them fairly easily. In drawing the
curve, as well as in calculating the constants, the sign of the
skewness must be borne in mind, for it is possible to draw the
curve with the skewness on the wrong side of the mode, and if
the distribution is nearly symmetrical, it is not so easy to
notice the mistake as one might expect. The tangent to the
curve at the mode is parallel to the axis of $x$ except in the case
of the J-shaped curves or some of the less common transition
types.

4. It is best to draw on a rather large scale in order to gain
distinctness, and the curves given here were drawn larger than
their present size, the reduction being, of course, made in the
process of reproduction.

The base elements should also be fairly large in proportion to
the height, so that the curve may not ascend too steeply;
otherwise small horizontal differences between the graduated
and ungraduated curves are apt to conceal large vertical
differences when the curve is rising or falling rapidly, but it
is the latter differences that are of importance.

5. The reader should notice that all the cases considered
in the following pages assume complete distributions, and it
is in general only possible to find the curve from part of a
distribution by means of successive approximation which is
extremely laborious. Another point, to which reference will
again be made, is with regard to grouping statistics; it is
sometimes impossible to obtain many groups, but for accuracy
in finding moments the greater the number of groups the
better, unless the total number of cases is small. A little
discretion is needed in this respect, but in actuarial statistics
which are sometimes based on as many as 200,000 cases,
seventy groups might be used for great accuracy. In our
examples we have grouped merely to save work, space and
printing, and the grouping does not alter the method.

If there is high contact so that we know the proper
adjustments, grouping leads to little or no error. An adjustment of
one-twelfth to the second moment when ten ages are grouped
and used as the unit has much more effect, proportionately,
than when only five ages are grouped or when individual
ages are used. The fear sometimes expressed that grouping
destroys accuracy has no proper foundation in such cases;
a little numerical evidence on this point will be found in
Appendix I.

6. Another matter with which it seems advisable to deal
here is connected with the criterion, $\kappa$. This may have any
value from $-\infty$ to $+\infty$, and from the following diagram it
will be seen how the types cover all the possible values of the
criterion and do not overlap.

Value of criterion      Type
$\kappa = -\infty$      Type III
$\kappa$ negative       Type I
$\kappa = 0$            Normal curve when $\beta_2 = 3$; Type II when $\beta_2 \neq 3$
$0 < \kappa < 1$        Type IV
$\kappa = 1$            Type V
$\kappa > 1$            Type VI
$\kappa = +\infty$      Type III

Just before $\kappa = 0$, Type I becomes nearly symmetrical, and
after that value is passed we have a skew curve of unlimited
range, and so on. At each critical point there are one or more
"transition" curves. If by a mistake a student should use
the wrong main type, he will find his mistake by reaching an
imaginary quantity in one of the square roots which occur in
the equations for the constants, but transition types can be
used when the values of the criterion approximate to the
theoretical values; they can, in fact, be viewed as
approximations which give an accurate result in a limiting case. It is
impossible to give exact limits within which we are justified
in using a transition type; theoretically, as we shall see later,
the justification depends on the size of the standard error of
the function dealt with, but in practice we can be guided to
a great extent by the size of the experience; if there are few
cases, a larger deviation in the criterion will arise than if there
are many. Individual cases must be considered on their
merits, but if the student finds himself in doubt he can avoid
using the transition type and be on the safe side in the matter
of accuracy. The student has one safe guide in every case,
namely, that "the proof of the pudding is in the eating".
He should try transition curves in a few cases where he has
little hope of their applicability and compare the results with
those obtained by the right main types and he will then learn
much about both classes of curves.

7. In the formulae that are given for the various types, the
choice of sign for a square root depends on the sign of $\mu_3$. If
the frequency is concentrated more closely before the mean
than after it, the mode is on the left-hand side of the mean and
$\mu_3$ is positive; the signs of certain constants in each type must
therefore depend on the sign of $\mu_3$, in order that the mode and
mean may lie in their correct relative positions. Where,
however, no remark is made as to the sign of the expression in
which a square root is given the positive root is implied, and the
reader will find that these rules become easier to follow when
he has worked but two examples, one giving a positive and the
other a negative value for $\mu_3$. Thus, if we imagine the
frequencies in the example for Type I to be written in the opposite
order 1, 3, 7, 13, etc., all the numerical work would be the
same, but $m_1$ would be 2.776978, $m_2$ = 0.409833, $a_1$ = 13.52728,
and $a_2$ = 1.99638, and the graduation would be the same, but
the numbers in the columns of the table on p. 62 would run
in the opposite order.
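The effect of reversing the order of the frequencies is easily confirmed. In the Python sketch below (function name ours, frequencies from the Type I example) the second and fourth moments about the mean are unchanged by the reversal, while the third moment merely changes sign.

```python
from math import isclose


def central_moments(frequencies):
    """Second, third and fourth moments about the mean, unit grouping."""
    total = sum(frequencies)
    mean = sum(k * f for k, f in enumerate(frequencies)) / total
    return tuple(
        sum(f * (k - mean) ** power for k, f in enumerate(frequencies)) / total
        for power in (2, 3, 4)
    )


frequencies = [34, 145, 156, 145, 123, 103, 86, 71, 55, 37, 21, 13, 7, 3, 1]
mu2, mu3, mu4 = central_moments(frequencies)
rev2, rev3, rev4 = central_moments(frequencies[::-1])

print("original order:", round(mu2, 5), round(mu3, 4), round(mu4, 3))
print("reversed order:", round(rev2, 5), round(rev3, 4), round(rev4, 3))
```

Since $\beta_1$ depends on $\mu_3^2$, the criterion and the type are unaffected; only the assignment of the two roots to $m_1$ and $m_2$ is interchanged.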

8. The arithmetical work is heavy and in some respects
unfamiliar to most students. There is no royal road to success
in it except care, system and the use of common sense at the
final stage. It is irritating at the end of a lengthy piece of
arithmetic to find a slip at an early stage and to have to
recalculate, but these slips become fewer and of less importance
with experience, for when we are in practice we suspect a large
error immediately an erroneous value is reached. Personally
I use seven-figure logarithms as a rule and put a check on
every step, although not necessarily to the last figure. This
plan was followed with the arithmetical work in this book.
The check might not disclose a slip which did not affect the
graduation or only affected the final figures of a constant or
coefficient. Thus if the last three figures of $\log y_0$ on p. 61 were
wrong (which I have no reason to suppose) the mistake would
be regrettable, but the graduation in the table on the following
page would be unaffected. Moreover, difficulty may be found
in reproducing exactly the numerical result of another
calculator, owing to the usual unreliability of the end figures when
many operations have been made. In lengthy arithmetic the
two final figures may be unreliable and two arithmetical
processes may both be correct and yet give divergencies. This
does not mean that five-figure logarithms are as good as seven,
for if seven figures give five figures accurately, we assume that
generally speaking five-figure work will only be reliable to
three figures.





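FORMULAE FOR MOMENTS
These formulae apply to all the
TYPES OF CURVES

$$\begin{aligned}
\nu_1' &= d \\
\nu_2 &= \nu_2' - d^2 \\
\nu_3 &= \nu_3' - 3d\nu_2 - d^3 = \nu_3' - 3d\nu_2' + 2d^3 \\
\nu_4 &= \nu_4' - 4d\nu_3 - 6d^2\nu_2 - d^4 = \nu_4' - 4d\nu_3' + 6d^2\nu_2' - 3d^4
\end{aligned}$$

or

$$\begin{aligned}
S_2 &= d \\
\nu_2 &= 2S_3 - d(1 + d) \\
\nu_3 &= 6S_4 - 3\nu_2(1 + d) - d(1 + d)(2 + d) \\
\nu_4 &= 24S_5 - 2\nu_3\{2(1 + d) + 1\} - \nu_2\{6(1 + d)(2 + d) - 1\} - d(1 + d)(2 + d)(3 + d)
\end{aligned}$$

Sheppard’s adjustments when the
curve has high contact at both ends

$$\begin{aligned}
\mu_2 &= \nu_2 - \tfrac{1}{12} \\
\mu_3 &= \nu_3 \\
\mu_4 &= \nu_4 - \tfrac{1}{2}\nu_2 + \tfrac{7}{240}
\end{aligned}$$

$$\sigma \text{ (standard deviation)} = \sqrt{\mu_2}$$

$$\beta_1 = \mu_3^2/\mu_2^3$$

$$\beta_2 = \mu_4/\mu_2^2$$

$$\kappa = \frac{\beta_1(\beta_2 + 3)^2}{4(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6)}$$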



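FIRST MAIN TYPE (TYPE I)

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{m_1}\left(1 - \frac{x}{a_2}\right)^{m_2}$$

where $m_1/a_1 = m_2/a_2$.

Origin at mode.

The values to be calculated in order are

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{6 + 3\beta_1 - 2\beta_2}$$

$$a_1 + a_2 = \tfrac{1}{2}\sqrt{\mu_2}\,\sqrt{\beta_1(r + 2)^2 + 16(r + 1)}$$

The $m$'s are given by

$$\tfrac{1}{2}\left\{r - 2 \pm r(r + 2)\sqrt{\frac{\beta_1}{\beta_1(r + 2)^2 + 16(r + 1)}}\right\}$$

when $\mu_3$ is positive $m_2$ is the positive root.

$$y_0 = \frac{N}{a_1 + a_2} \cdot \frac{m_1^{m_1} m_2^{m_2}}{(m_1 + m_2)^{m_1 + m_2}} \cdot \frac{\Gamma(m_1 + m_2 + 2)}{\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)}$$

$$\text{Mode} = \text{Mean} - \tfrac{1}{2}\,\frac{\mu_3}{\mu_2}\,\frac{r + 2}{r - 2}$$

If expressing curve with origin at mean (see Table VI facing
p. 51)

$$A_1 + A_2 = a_1 + a_2$$

$$(m_1 + 1)/A_1 = (m_2 + 1)/A_2$$

$$y_e = \frac{N}{A_1 + A_2} \cdot \frac{(m_1 + 1)^{m_1}(m_2 + 1)^{m_2}}{(m_1 + m_2 + 2)^{m_1 + m_2}} \cdot \frac{\Gamma(m_1 + m_2 + 2)}{\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)}$$

For table of $\Gamma$ functions see p. 266, or Tables for Statisticians.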
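The order of calculation for Type I can be sketched in Python as follows. The function and key names are our own, and the sketch assumes both $m$'s are positive (the ordinary bell-shaped case), since it takes logarithms of $m_1$ and $m_2$ when forming $y_0$; the J- and U-shaped cases need the separate treatment described in the notes. The moments are those of the example that follows, with $N = 1000$.

```python
from math import exp, lgamma, log, sqrt


def type_one_constants(mu2, mu3, mu4, total=1.0):
    """Constants of Pearson's Type I with origin at the mode."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    r = 6 * (beta2 - beta1 - 1) / (6 + 3 * beta1 - 2 * beta2)
    root = sqrt(beta1 * (r + 2) ** 2 + 16 * (r + 1))
    span = 0.5 * sqrt(mu2) * root                      # a_1 + a_2
    half_gap = 0.5 * r * (r + 2) * sqrt(beta1) / root
    smaller, larger = 0.5 * (r - 2) - half_gap, 0.5 * (r - 2) + half_gap
    # when mu_3 is positive m_2 takes the positive root
    m1, m2 = (smaller, larger) if mu3 >= 0 else (larger, smaller)
    a1 = span * m1 / (m1 + m2)                         # from m_1/a_1 = m_2/a_2
    a2 = span * m2 / (m1 + m2)
    log_y0 = (log(total) - log(span)
              + m1 * log(m1) + m2 * log(m2) - (m1 + m2) * log(m1 + m2)
              + lgamma(m1 + m2 + 2) - lgamma(m1 + 1) - lgamma(m2 + 1))
    return {
        "r": r, "m1": m1, "m2": m2, "a1": a1, "a2": a2,
        "y0": exp(log_y0),
        "mean_minus_mode": 0.5 * (mu3 / mu2) * (r + 2) / (r - 2),
    }


def type_one_ordinate(x, constants):
    """Ordinate at distance x from the mode, for -a_1 < x < a_2."""
    c = constants
    return c["y0"] * (1 + x / c["a1"]) ** c["m1"] * (1 - x / c["a2"]) ** c["m2"]


constants = type_one_constants(7.66237, 15.1069, 172.326, total=1000)
for key, value in constants.items():
    print(f"{key:16s}{value:.6f}")
```

Using `lgamma` plays the part of the table of log $\Gamma$ functions; working in logarithms throughout mirrors the hand method and avoids overflow when the $m$'s are large.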





NOTES 


The usual shape of the curve is like that of the following
example, but if $m_1$ and $m_2$ are approximately equal it is nearly
symmetrical; if $m_1$ and $m_2$ are not small it tails off at both ends,
and if both $m_1$ and $m_2$ are small it rises abruptly at both ends.
If $m_1$ is negative the curve is J-shaped; it starts at an infinite
ordinate, falls rapidly and runs out at a fixed point (for
numerical example see p. 126). If both $m_1$ and $m_2$ are negative,
the curve is U-shaped, starting and ending with infinite
ordinates and having an anti-mode instead of a mode as the
usual origin (for numerical example see p. 112). In the J- and
U-shaped curves, though the ordinate is infinite, the area is
finite. Care is needed in these cases when taking out the $\Gamma$
function, for $\Gamma(t)$ is required when $t < 1$ and the tables give
$\log \Gamma(t + 1)$, i.e., $\log t + \log \Gamma(t)$. In the case of J-shaped curves
it is best to use the form with origin at the mean or express
the curve in the form $y = y' x^{m_1}(a_1 + a_2 - x)^{m_2}$ with the origin at the
start of the curve and

$$y' = \frac{N}{(a_1 + a_2)^{m_1 + m_2 + 1}} \cdot \frac{\Gamma(m_1 + m_2 + 2)}{\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)}$$

An interesting variant of the J-shaped curve arises when $m_1$
and $m_2$ are both arithmetically less than unity and one of them
is negative. The shape is then like that of No. (11) in the
diagram of curves at the end of the book, i.e. it is of twisted
J-shape (for example and further notes, see pp. 111-3).
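The point about the $\Gamma$ function for arguments below unity rests on the relation $\log \Gamma(t) = \log \Gamma(t + 1) - \log t$. A short Python illustration follows; the helper name is ours, and the standard-library `lgamma` stands in for the printed table.

```python
from math import isclose, lgamma, log


def log_gamma_from_table(t):
    """log Gamma(t) for 0 < t < 1, obtained as a table user would:
    look up log Gamma(t + 1) and subtract log t."""
    return lgamma(t + 1) - log(t)


t = 0.3  # e.g. m + 1 when m = -0.7 in a J-shaped curve
print(log_gamma_from_table(t), lgamma(t))
print(isclose(log_gamma_from_table(t), lgamma(t), rel_tol=1e-12))
```

Natural logarithms are used here; with common logarithms, as in the tables, the same subtraction applies.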

EXAMPLE 

As an example of this type the figures in Table I (Example II)
may be used. The moments were first found by the summation
method (see Chapter III, § 9) as shown in the following table.
The reader can check the result by recalculating the moments
by the more direct method, taking age 42 as the arbitrary
origin. This is how I should myself usually do the work; I only
use the summation method when the series is a very long one,
and I give it here merely by way of example.


Central age   Exposed to risk,      First     Second      Third     Fourth
of group      Example II of          sum        sum        sum        sum
              Table I

17                  34             1,000      5,175     19,809     64,389
22                 145               966      4,175     14,634     44,580
27                 156               821      3,209     10,459     29,946
32                 145               665      2,388      7,250     19,487
37                 123               520      1,723      4,862     12,237
42                 103               397      1,203      3,139      7,375
47                  86               294        806      1,936      4,236
52                  71               208        512      1,130      2,300
57                  55               137        304        618      1,170
62                  37                82        167        314        552
67                  21                45         85        147        238
72                  13                24         40         62         91
77                   7                11         16         22         29
82                   3                 4          5          6          7
87                   1                 1          1          1          1

Totals           1,000             5,175     19,809     64,389    186,638

$$\begin{aligned}
S_2 &= 5175/1000 = 5.175 \\
S_3 &= 19809/1000 = 19.809 \\
S_4 &= 64389/1000 = 64.389 \\
S_5 &= 186638/1000 = 186.638
\end{aligned}$$

The next step is to find the moments about the centroid
vertical using the formulae on p. 57, and, in this case, as no
adjustments* were made in the moments the ν's and μ's are
the same.

$\mu_2 = 7.66237$    $\beta_1 = .5072955$

$\mu_3 = 15.1069$    $\beta_2 = 2.935110$

$\mu_4 = 172.326$

From the values of $\beta_1$ and $\beta_2$ the criterion ($\kappa$) can be
calculated, and its value being $-.2645$ shows that Type I
must be used (see Table VI).

* The moments should have been adjusted by one of the methods suitable
when the curve is abrupt. These have been discussed since the example was
prepared, and it was unnecessary to recalculate; see, however, Appendix I.
Similar qualifications apply to a few of the other examples.
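The summation-method arithmetic above can be checked mechanically. The following is only a sketch: it assumes the origin is one unit before the first group, uses the usual factorial-moment identities to pass from the column totals to power moments (these stand in for the formulae of Chapter III, which are not reproduced in this section), and takes Pearson's criterion in its standard form $\kappa = \beta_1(\beta_2+3)^2 / \{4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)\}$. All function and variable names are illustrative.

```python
from itertools import accumulate

# Exposed to risk at central ages 17, 22, ..., 87 (Example II of Table I).
freq = [34, 145, 156, 145, 123, 103, 86, 71, 55, 37, 21, 13, 7, 3, 1]

def column_totals(f, depth=4):
    """Total of f, followed by the totals of the first to depth-th sum columns."""
    totals, col = [sum(f)], list(f)
    for _ in range(depth):
        # each new column is the running total of the previous one, taken from the bottom
        col = list(accumulate(reversed(col)))[::-1]
        totals.append(sum(col))
    return totals

totals = column_totals(freq)
N = totals[0]
S2, S3, S4, S5 = (t / N for t in totals[1:])

# Moments about the arbitrary origin (assumed identities, unit = 5 years).
nu1 = S2
nu2 = 2 * S3 - S2
nu3 = 6 * S4 - 6 * S3 + S2
nu4 = 24 * S5 - 36 * S4 + 14 * S3 - S2

# Moments about the mean.
mu2 = nu2 - nu1 ** 2
mu3 = nu3 - 3 * nu1 * nu2 + 2 * nu1 ** 3
mu4 = nu4 - 4 * nu1 * nu3 + 6 * nu1 ** 2 * nu2 - 3 * nu1 ** 4

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
kappa = beta1 * (beta2 + 3) ** 2 / (
    4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6))
```

Under these assumptions the column totals and the moments agree with the figures quoted in the text to the number of places printed there.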

( 6o ) 



$r = 5.186811$      $\log r = .7149004$
$r + 1 = 6.186811$  $\log (r + 1) = .7914669$
$r + 2 = 7.186811$  $\log (r + 2) = .8565363$
$r - 2 = 3.186811$  $\log (r - 2) = .5033563$

The values of $\log (r + 1)$, etc. were checked by a Gauss-
logarithm table.

$a_1 + a_2 = 15.52366$   $a_1 = 1.99638$

$m_1 = .409833$          $a_2 = 13.52728$

$m_2 = 2.776978$         Mean $-$ mode $= 2.223116$

It will be noted that the expression $\sqrt{\beta_1(r+2)^2 + 16(r+1)}$
occurs in both the values of $(a_1 + a_2)$ and $m$.

The mean is at age 12 + 5.175 × 5 = 37.8750, and the mode
at age 37.8750 - 2.223116 × 5 = 26.75942.

The skewness is .8032.

The calculation of $\log y_0$ is as follows.

$$\begin{aligned}
\log N &= 3.00000\\
\operatorname{colog}(a_1 + a_2) &= \bar{2}.80901\\
m_1 \log m_1 &= \bar{1}.84123\\
m_2 \log m_2 &= 1.23179\\
\operatorname{colog}(r-2)^{r-2} &= \bar{2}.39590\\
\log \Gamma(r) &= 1.50406\\
\operatorname{colog}\Gamma(m_1 + 1) &= .05219\\
\operatorname{colog}\Gamma(m_2 + 1) &= \bar{1}.34037\\
\log y_0 &= 2.17455
\end{aligned}$$

where, of course, $\log \Gamma(m_2 + 1) = \log \Gamma(3.776978) = \log 2.776978
+ \log 1.776978 + \log \Gamma(1.776978)$, the last value being taken
from the table at the end of the book.

The work to this point gives as the curve for graduating the
statistics

$$y = 149.47\left(1 + \frac{x}{1.99638}\right)^{.409833}\left(1 - \frac{x}{13.52728}\right)^{2.776978}$$

where the origin is at age 26.75942 and the unit is five years.
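The logarithmic build-up of $y_0$ can be repeated without tables. This sketch assumes the constant has the form used in the column of logarithms above, that is $y_0 = \frac{N}{a_1+a_2}\,\frac{m_1^{m_1} m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\,\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\Gamma(m_2+1)}$, and it takes the values of $m_1$, $m_2$, $a_1$, $a_2$ as printed rather than recomputing them. The function names are illustrative.

```python
from math import exp, lgamma, log

N = 1000.0
m1, m2 = 0.409833, 2.776978     # exponents as printed in the example
a1, a2 = 1.99638, 13.52728      # range constants as printed (unit = 5 years)

def type1_y0(N, a1, a2, m1, m2):
    """y0 for the Type I curve with origin at the mode (assumed form, see lead-in)."""
    ln_y0 = (log(N) - log(a1 + a2)
             + m1 * log(m1) + m2 * log(m2) - (m1 + m2) * log(m1 + m2)
             + lgamma(m1 + m2 + 2) - lgamma(m1 + 1) - lgamma(m2 + 1))
    return exp(ln_y0)

y0 = type1_y0(N, a1, a2, m1, m2)

def ordinate(age, origin=26.75942, unit=5.0):
    """Ordinate of the fitted curve at a given age."""
    x = (age - origin) / unit
    return y0 * (1 + x / a1) ** m1 * (1 - x / a2) ** m2
```

With the printed constants, `y0` should come out close to 149.47, and `ordinate(42)` close to the 107.6 shown later in the table of ordinates.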

( 6i ) 



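The following table shows the calculation of ordinates of the
curve from the equation just given.

| Age (1) | $1 + x/a_1$ (2) | $1 - x/a_2$ (3) | log (2) (4) | log (3) (5) | $m_1 \times$ col. (4) (6) | $m_2 \times$ col. (5) (7) | col. (6) + col. (7) + $\log y_0 = \log y$ (8) | $y$ (9) | Area (10) |
|---|---|---|---|---|---|---|---|---|---|
| 17 | .02228 | 1.14429 | $\bar{2}$.34792 | 0.05854 | $\bar{1}$.3229 | 0.1626 | 1.6601 | 45.7 | 44 |
| 22 | .52319 | 1.07037 | $\bar{1}$.71866 | .02955 | $\bar{1}$.8847 | .0821 | 2.1404 | 138.2 | 137 |
| 27 | 1.02410 | .99644 | 0.01034 | $\bar{1}$.99845 | .0042 | $\bar{1}$.9957 | 2.1745 | 149.5 | 149 |
| 32 | 1.52501 | .92252 | .18327 | $\bar{1}$.96498 | .0751 | $\bar{1}$.9027 | 2.1525 | 142.1 | 142 |
| 37 | 2.02592 | .84859 | .30662 | $\bar{1}$.92870 | .1257 | $\bar{1}$.8020 | 2.1023 | 126.6 | 127 |
| 42 | 2.52683 | .77466 | .40257 | $\bar{1}$.88911 | .1650 | $\bar{1}$.6921 | 2.0317 | 107.6 | 108 |
| 47 | 3.02774 | .70074 | .48111 | $\bar{1}$.84556 | .1972 | $\bar{1}$.5711 | 1.9429 | 87.7 | 88 |
| 52 | 3.52865 | .62681 | .54760 | $\bar{1}$.79714 | .2244 | $\bar{1}$.4367 | 1.8357 | 68.5 | 69 |
| 57 | 4.02956 | .55289 | .60526 | $\bar{1}$.74264 | .2481 | $\bar{1}$.2853 | 1.7080 | 51.0 | 51 |
| 62 | 4.53047 | .47896 | .65615 | $\bar{1}$.68030 | .2689 | $\bar{1}$.1122 | 1.5557 | 36.0 | 36 |
| 67 | 5.03138 | .40504 | .70169 | $\bar{1}$.60750 | .2876 | $\bar{2}$.9100 | 1.3722 | 23.6 | 24 |
| 72 | 5.53229 | .33111 | .74291 | $\bar{1}$.51997 | .3045 | $\bar{2}$.6670 | 1.1461 | 14.0 | 14 |
| 77 | 6.03320 | .25719 | .78055 | $\bar{1}$.41025 | .3199 | $\bar{2}$.3623 | .8568 | 7.2 | 7 |
| 82 | 6.53411 | .18326 | .81519 | $\bar{1}$.26307 | .3341 | $\bar{3}$.9535 | .4622 | 2.9 | 3 |
| 87 | 7.03502 | .10934 | .84726 | $\bar{1}$.03878 | .3472 | $\bar{3}$.3307 | $\bar{1}$.8525 | .7 | 1 |
| 92 | 7.53593 | .03541 | .87714 | $\bar{2}$.54913 | .3595 | $\bar{5}$.9709 | $\bar{2}$.5050 | | |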




Cols. (2) and (3) have a constant first difference, viz.
$1/a_1$ or .500907, and $1/a_2$ or .073925. The value at any point
having been calculated and checked, the other items are
formed continuously. Cols. (4)-(9) explain themselves, but
we may remark that it is generally advisable to use a larger
number of figures than five in taking logarithms, especially
if $m_1$ or $m_2$ is large. A little care is necessary in multiplying
such numbers as $\bar{1}$.71866 by $m_1$ (.409833). If an arithmometer
is used, $m_1$ is put on the plate, and is multiplied by -.28134,
and the result -.1153 must be put in the form $\bar{1}$.8847, to
enable us to add it to other logarithms. Col. (10) gives the area,
and was formed by applying one of the formulae on p. 52.
The area of the first group must be treated separately, as the
curve starts at age 16.7775, and the base of the group is there-
fore 2.7225 in length, instead of 5 years as in the other cases.
A good way to find the area is to calculate the ordinates for
the middle and ends of the base, and apply Simpson's rule, viz.
one-sixth of the sum of the two end ordinates and four times
the mid-ordinate, remembering to multiply the result by
2.7225/5 to allow for the different length of the base.

The mid-ordinate is 92.1, the ordinate at the end of the base
is 116.5, and the ordinate at the start is of course zero; the
area is approximately*

$$\frac{2.7225}{5} \times \tfrac{1}{6}\{0 + 4 \times 92.1 + 116.5\} = 44.$$
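The Simpson's-rule step for the shortened first group is a one-line calculation; the sketch below simply repeats the arithmetic of the text, with an illustrative helper name.

```python
def simpson_three(y_start, y_mid, y_end, base):
    """Simpson's rule on three equidistant ordinates spanning a base of the given length."""
    return base * (y_start + 4.0 * y_mid + y_end) / 6.0

# The base is 2.7225 years, i.e. 2.7225/5 of the five-year unit used for the ordinates.
area_first_group = simpson_three(0.0, 92.1, 116.5, 2.7225 / 5.0)
```

The result rounds to the 44 entered against age 17 in col. (10).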

Some people find it better when calculating the ordinates
to use the form given in the Notes on p. 59, with the origin
at the start of the curve; it avoids bringing in the reciprocals
of $a_1$ and $a_2$. The columns of $\log x$ and $\log (a_1 + a_2 - x)$ can be
formed continuously with the aid of Gauss-logarithms. The
initial values will have to be calculated and as a check one or
perhaps two other values.



[Figure: horizontal axis marked at central ages 17, 22, 27, ..., 92.]


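PROOF OF FORMULAE†

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{m_1}\left(1 - \frac{x}{a_2}\right)^{m_2}, \quad\text{where}\quad \frac{m_1}{a_1} = \frac{m_2}{a_2}.$$

* For greater accuracy use more ordinates or Tables of incomplete B-functions.
† The reader who has little acquaintance with formulae of reduction and the
Γ and B functions, should consult Appendix II before reading the proofs of the
formulae for this and the other types.

Let $a_1 + a_2 = b$ and $z = \dfrac{a_1 + x}{a_1 + a_2}$.

The area from $x = -a_1$ to $x = +a_2$ is the total frequency $N$,

$$N = \int_{-a_1}^{a_2} \frac{y_0}{a_1^{m_1} a_2^{m_2}}\,(a_1 + x)^{m_1}(a_2 - x)^{m_2}\,dx$$

$$= \frac{y_0\,(m_1 + m_2)^{m_1+m_2}}{m_1^{m_1} m_2^{m_2}\,(a_1 + a_2)^{m_1+m_2}} \int_0^1 [z(a_1 + a_2)]^{m_1}\,[(1 - z)(a_1 + a_2)]^{m_2}\,(a_1 + a_2)\,dz$$

$$= \frac{y_0\,(m_1 + m_2)^{m_1+m_2}(a_1 + a_2)}{m_1^{m_1} m_2^{m_2}} \int_0^1 z^{m_1}(1 - z)^{m_2}\,dz.$$

Or

$$y_0 = \frac{N}{b}\,\frac{m_1^{m_1} m_2^{m_2}}{(m_1 + m_2)^{m_1+m_2}}\,\frac{1}{B(m_1 + 1,\, m_2 + 1)}
= \frac{N}{b}\,\frac{m_1^{m_1} m_2^{m_2}}{(m_1 + m_2)^{m_1+m_2}}\,\frac{\Gamma(m_1 + m_2 + 2)}{\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)}.$$

Using the same method for the moments as that just given
for the area, we see that the $n$th moment, about the line
parallel to the axis of $y$ through $x = -a_1$, is

$$\mu_n' = \frac{y_0}{N}\int_{-a_1}^{a_2} \frac{(a_1 + x)^{m_1+n}(a_2 - x)^{m_2}}{a_1^{m_1} a_2^{m_2}}\,dx
= \frac{y_0}{N}\,\frac{b^{\,n+1}(m_1 + m_2)^{m_1+m_2}}{m_1^{m_1} m_2^{m_2}}\,\frac{\Gamma(m_1 + n + 1)\,\Gamma(m_2 + 1)}{\Gamma(m_1 + m_2 + n + 2)}.$$

Now, since $\Gamma(p) = (p - 1)\,\Gamma(p - 1)$, the moments about the
line parallel to the axis of $y$ through $x = -a_1$ are as follows:

$$\mu_1' = \frac{b(m_1 + 1)}{m_1 + m_2 + 2}, \qquad
\mu_2' = \frac{b^2(m_1 + 1)(m_1 + 2)}{(m_1 + m_2 + 2)(m_1 + m_2 + 3)}.$$

Changing the origin in order to get moments about the mean
and writing $m_1' = m_1 + 1$ and $m_2' = m_2 + 1$ and $r = m_1' + m_2'$, we
have

$$\mu_2 = \frac{b^2 m_1' m_2'}{r^2(r + 1)},$$

$$\mu_3 = \frac{2b^3 m_1' m_2'(m_2' - m_1')}{r^3(r + 1)(r + 2)},$$

$$\mu_4 = \frac{3b^4 m_1' m_2'\{m_1' m_2'(r - 6) + 2r^2\}}{r^4(r + 1)(r + 2)(r + 3)}.$$

We can simplify these expressions to obtain the equations on
p. 58 by writing $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, $\epsilon = m_1' m_2'$; then

$$\beta_1 = \frac{4(r^2 - 4\epsilon)(r + 1)}{\epsilon(r + 2)^2} \quad\text{or}\quad \frac{\beta_1(r + 2)^2}{4(r + 1)} = \frac{r^2}{\epsilon} - 4$$

and

$$\beta_2 = \frac{3(r + 1)\{2r^2 + \epsilon(r - 6)\}}{\epsilon(r + 2)(r + 3)} \quad\text{or}\quad \frac{\beta_2(r + 2)(r + 3)}{3(r + 1)} = \frac{2r^2}{\epsilon} + r - 6.$$

Eliminating $r^2/\epsilon$ we find

$$\frac{\beta_1(r + 2)^2}{2(r + 1)} + r + 2 = \frac{\beta_2(r + 2)(r + 3)}{3(r + 1)}.$$

Dividing out by $r + 2$ we have

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{3\beta_1 - 2\beta_2 + 6}.$$

From the equation

$$\frac{\beta_1(r + 2)^2}{4(r + 1)} = \frac{r^2}{\epsilon} - 4 \quad\text{we have}\quad \epsilon = \frac{r^2}{4 + \tfrac{1}{4}\beta_1\dfrac{(r + 2)^2}{r + 1}},$$

and from the equation for $\mu_2$

$$b^2 = \frac{\mu_2 r^2(r + 1)}{\epsilon}.$$

The other equations follow at once from $r = m_1' + m_2'$ and
$\epsilon = m_1' m_2'$. The distance between the mode and mean is
$\mu_1' - b\,m_1/(m_1 + m_2)$, which can be easily reduced
to the form given. A general value (regardless of type) for the
distance was given in Chapter IV, § 5.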
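The algebra connecting $m_1'$, $m_2'$ with $\beta_1$, $\beta_2$ and $r$ can be spot-checked numerically. The sketch below uses purely hypothetical exponents ($m_1 = 0.5$, $m_2 = 3$, chosen only for illustration), forms $\beta_1$ and $\beta_2$ from the expressions in $r$ and $\epsilon$, and confirms that the formula for $r$ returns $m_1' + m_2'$.

```python
def betas_from_exponents(m1, m2):
    """beta1 and beta2 of a Type I curve from its exponents, via r and epsilon."""
    m1p, m2p = m1 + 1.0, m2 + 1.0
    r = m1p + m2p
    eps = m1p * m2p
    beta1 = 4.0 * (r * r - 4.0 * eps) * (r + 1.0) / (eps * (r + 2.0) ** 2)
    beta2 = 3.0 * (r + 1.0) * (2.0 * r * r + eps * (r - 6.0)) / (
        eps * (r + 2.0) * (r + 3.0))
    return beta1, beta2

def r_from_betas(beta1, beta2):
    """Invert the pair of equations for r."""
    return 6.0 * (beta2 - beta1 - 1.0) / (3.0 * beta1 - 2.0 * beta2 + 6.0)

# Hypothetical exponents, not taken from any example in the text.
b1, b2 = betas_from_exponents(0.5, 3.0)
r_back = r_from_betas(b1, b2)   # should reproduce m1' + m2' = 5.5
```

This checks the elimination only; it says nothing about the fit of any particular data.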

( 6s ) 





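SECOND MAIN TYPE (TYPE IV)

$$y = y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu \tan^{-1}(x/a)}$$

Origin is $\nu a/r$ after mean.
The values to be calculated in order are

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{2\beta_2 - 3\beta_1 - 6}$$

$$m = \tfrac{1}{2}(r + 2)$$

$$\nu = \frac{r(r - 2)\sqrt{\beta_1}}{\sqrt{16(r - 1) - \beta_1(r - 2)^2}}$$

$$a = \sqrt{\frac{\mu_2}{16}}\,\sqrt{16(r - 1) - \beta_1(r - 2)^2}$$

$$y_0 = \frac{N}{a\,F(r, \nu)}$$

$$\text{Mode} = \text{mean} - \frac{\mu_3(r - 2)}{2\mu_2(r + 2)}$$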


( 66 ) 



NOTES 


The curve is skew and has unlimited range in both directions.
$\mu_3$ and $\nu$ have opposite signs, i.e. when $\mu_3$ is positive $\nu$ is
negative.

A simple way to calculate the curve is to put it in the form

$$x = a \tan\theta, \qquad y = y_0 \cos^{r+2}\theta\, e^{-\nu\theta}.$$

Then $\theta$ is taken as 10°, 20°, 30°, etc., and $x$ and $y$ found; this
gives corresponding values of $x$ and $y$, but the values of $y$ will
not be for equidistant values of $x$. In calculating $e^{-\nu\theta}$ the value
of $\theta$ must be taken in circular measure. If equidistant ordi-
nates are required to be calculated accurately, little is gained
by the double form, and if we had good tables of $\log (1 + x^2)$
and $\tan^{-1} x$, the calculation of a particular ordinate would be
a very simple matter. The calculation and meaning of $F(r, \nu)$
are dealt with in the proof. The log of this function is tabulated
in Tables for Statisticians. When $r$ is fairly large a close approxi-
mation to $y_0$, where $\tan\phi = \nu/r$, is given by

$$y_0 = \frac{N}{a}\sqrt{\frac{r}{2\pi}}\;\frac{e^{\frac{\cos^2\phi}{3r} - \frac{1}{12r} - \phi\nu}}{(\cos\phi)^{r+1}}.$$

We appear to reach the expression that looks shortest and
simplest with the origin as shown on the previous page; it has
generally been used and it is therefore given. This origin has,
however, no physical meaning and there is much to be said
for using the more complicated looking form with the origin
at the mean, namely

$$y = y_0\left\{1 + \left(\frac{x}{a} - \frac{\nu}{r}\right)^2\right\}^{-m} e^{-\nu\tan^{-1}\left(\frac{x}{a} - \frac{\nu}{r}\right)};$$

see table facing p. 51.

The value of this expression when $x = 0$, i.e. the value of the
ordinate at the mean, is

$$y_0\left(1 + \frac{\nu^2}{r^2}\right)^{-m} e^{\nu\tan^{-1}(\nu/r)} = \frac{N}{\sigma\,H(r, \nu)},$$

where $H(r, \nu)$ is a function related to $F(r, \nu)$. Its logarithm is
also tabulated in Tables for Statisticians. The reader will
appreciate at once that this curve needs considerable care;
it is the most difficult of all the Pearson-type curves.


EXAMPLES 

The numbers in the following nearly symmetrical distribution
represent the exposed to risk of sickness by Sutton's Sickness
Tables (males, all durations) when the number of weeks'
sickness is represented by the normal curve of error.


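| Central age | No. exposed | Graduated by Type IV |
|---|---|---|
| 5 | 10 | 6* |
| 10 | 13 | 16 |
| 15 | 41 | 49 |
| 20 | 115 | 135 |
| 25 | 326 | 321 |
| 30 | 675 | 653 |
| 35 | 1,113 | 1,108 |
| 40 | 1,528 | 1,535 |
| 45 | 1,692 | 1,712 |
| 50 | 1,530 | 1,522 |
| 55 | 1,122 | 1,074 |
| 60 | 610 | 604 |
| 65 | 255 | 274 |
| 70 | 86 | 102 |
| 75 | 26 | 32 |
| 80 | 8 | 8 |
| 85 | 2 | 2 |
| 90 | 1 | 1 |
| 95 | 1 | |
| | 9,154 | 9,154 |

* This group has been taken as the area of the rest of the curve.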

The following values were obtained:

Mean = 44.5772339    $\beta_1 = .0053656$

$\mu_2 = 4.527608$       $\beta_2 = 3.169897$

$\mu_3 = -.705687$       $\kappa = .0125$

$\mu_4 = 64.98048$

Type IV was used because, as there is a large number of
cases, the standard error of $\kappa$ will be small (see Chapter X).

$r = 40.12143$

$\nu = 4.450398$ (positive because $\mu_3$ is negative)

$a = 13.39152$

$m = 21.06072$

$Sk = -.03313$

When the 5-years unit with which we have been working
is changed to one year, $a$ becomes 66.9576, and $a^2 = 4483.325$.

The origin = mean + $\nu a/r$
= 52.004394.

The mode, which is wanted if the curve is drawn, is at
44.9299.

As $r$ is large the approximate form for $y_0$ was used,

$\tan\phi = \dfrac{4.450398}{40.12143}$, $\log\tan\phi = \log\tan 6°\,19'$; hence

$\log\cos\phi = \bar{1}.9973446$, and from this $y_0$ is found to be 273.3649.
The value was checked by the tables in Tables for Statisti-
cians.

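The calculation of ordinates by the double process is as
follows:

| $\theta$ | $x$ in years of age | $-4.450398\,\theta \log e$ | $42.12143 \log\cos\theta$ | $\log y$ | $y$ |
|---|---|---|---|---|---|
| 0° | 0 | | | 2.43675 | 273.37 |
| 1° | 1.1687 | $\bar{1}$.96637 | $\bar{1}$.99721 | 2.40033 | 251.38 |
| 2° | 2.3382 | $\bar{1}$.93253 | $\bar{1}$.98885 | 2.35813 | 228.10 |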

The second column is formed directly from the tables of
$\tan\theta$ by multiplying by $a$, and as $x$ is required in years,
13.39152 × 5 = 66.9576 should be used for $a$. The fourth
column is formed by multiplying $\log\cos\theta$ by $r + 2$, and the
third continuously by addition. When $\theta$ is negative, the third
column has to be subtracted from the fourth: i.e. it ceases to be
negative and becomes positive. In each case the fifth is formed
from the fourth + the third + $\log y_0$.
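The double process is easy to repeat by machine. The sketch below evaluates $x = a\tan\theta$ and $y = y_0\cos^{r+2}\theta\,e^{-\nu\theta}$ with the constants printed for this example ($a$ in years); it takes those constants as given and does not recompute them from the moments. The function name is illustrative.

```python
from math import cos, exp, radians, tan

y0 = 273.3649        # as printed
nu = 4.450398        # as printed
r = 40.12143         # as printed
a_years = 66.9576    # 13.39152 x 5

def type4_point(theta_degrees):
    """Return (x, y) on the Type IV curve for a given theta, origin as in the text."""
    theta = radians(theta_degrees)          # circular measure is needed in the exponent
    x = a_years * tan(theta)
    y = y0 * cos(theta) ** (r + 2) * exp(-nu * theta)
    return x, y

x1, y1 = type4_point(1.0)
x2, y2 = type4_point(2.0)
```

The values of `x1` and `x2` agree with the second column of the table; the ordinates agree to within the accuracy of five-figure logarithms.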

When drawing a curve of this type the position and height
of the mode can be noted and then corresponding points
inserted, e.g. $x = 1.1687$ and $y = 251.38$. Care must be
taken to give the curve its maximum at the right point.
If the calculation is made directly, the following columns
can be used:

| $x/a$ | $1 + x^2/a^2$ | $\log (1 + x^2/a^2)$ | $\tan^{-1}(x/a)$ in degrees, etc. | col. (4) in circular measure | $-\nu \times$ col. (5) $\times \log_{10} e$ | $-m \times$ col. (3) | $\log y = \log y_0$ + (6) + (7) | $y$ = antilog |
|---|---|---|---|---|---|---|---|---|
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) |

Col. (2) can be formed by differences since $\Delta(1 + X^2)
= 2X + 1$; $\tan^{-1}(x/a)$ has to be found by using a table of the
tangents of angles inversely. A table helpful for obtaining
col. (5) from col. (4) will be found in Chambers' Mathematical
Tables or in Tables for Statisticians.

The troublesome work of inverse interpolation in degrees,
minutes and seconds can be avoided by numbering the items
in a table of $\tan\theta$ from 0 onwards. Chambers' Tables, for
instance, give tangents for each minute in the following form:

| ′ | 0° | 1° | 2°, etc. |
|---|---|---|---|
| 0 | .0000000 | (60) .0174551 | (120) .0349208 |
| 1 | .0002909 | (61) .0177460 | (121) .0352120 |
| 2 | .0005818 | (62) .0180370 | (122) .0355033 |
| 3 | .0008727 | (63) .0183280 | (123) .0357945 |
| etc. | | | |


( 70 ) 



If in the column headed 1° we insert 60, 61, etc., and in the
column headed 2° we insert 120, 121, etc., as indicated by the
figures in brackets, we can make the inverse interpolation
in minutes. Then, as one minute in circular measure is
.0002908882, we can obtain the figure we require by multi-
plying by the conversion factor. In practice however it would
be combined into one multiplier with ($-\nu \log_{10} e$) and col. (6)
would be found directly from col. (4) by multiplying, in our
example, by 003519003. The labour of inserting the minutes
in a printed table is small, as all we need to do is to write the
number of minutes under the number of degrees at the head of
each column and add thereto at sight the marginal minutes
when the interpolations are being made.

Tables of $\tan^{-1}\theta$, etc. will be published shortly (Tracts for
Computers, No. XXIII) and these tables will simplify the cal-
culations.



( 71 ) 



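PROOF

In

$$y = y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}$$

put $\tan\theta = x/a$. Now $\theta = \tan^{-1}(x/a)$ and

$$\left(1 + \frac{x^2}{a^2}\right)^{-m} = \{1 + \tan^2\theta\}^{-m} = (\sec^2\theta)^{-m} = \cos^{2m}\theta,$$

so that

$$y = y_0\cos^{2m}\theta\,e^{-\nu\theta}.$$

$$N = \int_{-\infty}^{\infty} y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}\,dx
= \int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi} y_0\cos^{2m}\theta\,e^{-\nu\theta}\,\frac{a}{\cos^2\theta}\,d\theta,$$

substituting $\tan\theta = x/a$ so that $\dfrac{dx}{d\theta} = a\sec^2\theta = \dfrac{a}{\cos^2\theta}$,

$$= y_0 a\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\,e^{-\nu\theta}\,d\theta, \quad\text{where } r = 2m - 2,$$

$$= y_0 a\,e^{-\frac{1}{2}\nu\pi}\int_0^{\pi}\sin^r\phi\,e^{\nu\phi}\,d\phi,$$

substituting $\sin\phi$ for $\cos\theta$ so that $\tfrac{1}{2}\pi = \theta + \phi$ and changing
limits, $= y_0 a F(r, \nu)$, say.

The $n$th moment about the origin is

$$\mu_n' = \frac{1}{N}\int_{-\infty}^{\infty} y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}\,x^n\,dx$$

$$= \frac{y_0 a^{n+1}}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\,\tan^n\theta\,e^{-\nu\theta}\,d\theta, \text{ by substituting as above,}$$

$$= \frac{y_0 a^{n+1}}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^{r-n}\theta\,\sin^n\theta\,e^{-\nu\theta}\,d\theta$$

$$= \frac{y_0 a^{n+1}}{N}\left\{\left[-\frac{\cos^{r-n+1}\theta\,\sin^{n-1}\theta\,e^{-\nu\theta}}{r - n + 1}\right]_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}
+ \int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\frac{\cos^{r-n+1}\theta}{r - n + 1}\left[(n - 1)\sin^{n-2}\theta\cos\theta\,e^{-\nu\theta} - \nu\sin^{n-1}\theta\,e^{-\nu\theta}\right]d\theta\right\}$$

by integrating by parts and treating $\sin^{n-1}\theta\,e^{-\nu\theta}$ as one part
and $\cos^{r-n}\theta\sin\theta$ as the other, and remembering that

$$\int\cos^{r-n}\theta\,\sin\theta\,d\theta = -\frac{\cos^{r-n+1}\theta}{r - n + 1}.$$

Now, since $\cos^{r-n+1}\theta\,\sin^{n-1}\theta\,e^{-\nu\theta} = 0$ when $\theta$ becomes $\tfrac{1}{2}\pi$ or
$-\tfrac{1}{2}\pi$, we have

$$\mu_n' = \frac{y_0 a^{n+1}}{N(r - n + 1)}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\left\{(n - 1)\cos^{r-n+2}\theta\,\sin^{n-2}\theta\,e^{-\nu\theta} - \nu\cos^{r-n+1}\theta\,\sin^{n-1}\theta\,e^{-\nu\theta}\right\}d\theta$$

$$= \frac{a}{r - n + 1}\left\{(n - 1)\,a\,\mu_{n-2}' - \nu\,\mu_{n-1}'\right\}.$$

Further,

$$\mu_1' = \frac{y_0 a^2}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\,\tan\theta\,e^{-\nu\theta}\,d\theta
= -\frac{y_0 a^2}{N r}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\nu\cos^r\theta\,e^{-\nu\theta}\,d\theta,$$

by putting $n = 1$ in the above equation for $\mu_n'$,

$$= -\frac{a\nu}{r}, \quad\text{because}\quad N = y_0 a\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\,e^{-\nu\theta}\,d\theta.$$

Using the last result with the formula for the $n$th in terms
of the two previous moments, and remembering that $\mu_0'$ is unity,

$$\mu_2' = \frac{a^2}{r(r - 1)}\,(r + \nu^2),$$

$$\mu_3' = -\frac{a^3\nu}{r(r - 1)(r - 2)}\,(3r - 2 + \nu^2),$$

$$\mu_4' = \frac{a^4}{r(r - 1)(r - 2)(r - 3)}\,\{3r(r - 2) + \nu^2(6r - 8) + \nu^4\}.$$

Referring these moments to the centroid vertical, we have,
by putting $d = \mu_1' = -\dfrac{a\nu}{r}$ in the formulae on p. 57,

$$\mu_2 = \frac{a^2}{r^2(r - 1)}\,(r^2 + \nu^2),$$

$$\mu_3 = -\frac{4a^3\nu(r^2 + \nu^2)}{r^3(r - 1)(r - 2)},$$

$$\mu_4 = \frac{3a^4(r^2 + \nu^2)\{(r + 6)(r^2 + \nu^2) - 8r^2\}}{r^4(r - 1)(r - 2)(r - 3)}.$$

If now, we put $z$ for $r^2 + \nu^2$ and write as before,

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} \quad\text{and}\quad \beta_2 = \frac{\mu_4}{\mu_2^2},$$

we have

$$\frac{8r^2}{z} - 8 = -\frac{\beta_1(r - 2)^2}{2(r - 1)}$$

and

$$r + 6 - \frac{8r^2}{z} = \frac{\beta_2(r - 2)(r - 3)}{3(r - 1)}.$$

Adding and dividing out by $r - 2$, we have

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{2\beta_2 - 3\beta_1 - 6}$$

and

$$z = \frac{r^2}{1 - \dfrac{\beta_1(r - 2)^2}{16(r - 1)}}.$$

Finally, since $\nu^2 = z - r^2$, the other formulae on p. 66 follow at
once.

Since the tangent at the top of the maximum ordinate is
parallel to the axis of $x$, the position of the mode is such that
$dy/dx$ is zero at that point, i.e.

$$y_0\left(1 + \frac{x^2}{a^2}\right)^{-(m+1)} e^{-\nu\tan^{-1}(x/a)}\left\{-\frac{2mx}{a^2} - \frac{\nu}{a}\right\}$$

is zero. There are three cases, $x = -\infty$, $x = +\infty$, and a value
of $x$ such that $\dfrac{2mx}{a^2} + \dfrac{\nu}{a}$ is zero, or $x = -\dfrac{\nu a}{2m}$. The distance of the
mean from the origin is $\mu_1'$ or $-\dfrac{\nu a}{r}$, and, therefore, the distance
between the mean and mode is $\dfrac{\nu a}{r} - \dfrac{\nu a}{r + 2} = \dfrac{2\nu a}{r(r + 2)}$, which reduces to
the expression given on p. 66, when the values for $\nu$ and $a$,
on the same page, are inserted.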
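The reduction formula for the moments can be verified numerically against the closed expressions for $\mu_2$, $\mu_3$ and $\mu_4$. The sketch below uses hypothetical constants ($a = 2$, $r = 10$, $\nu = 3$, chosen only for illustration) and illustrative names; it checks the algebra, not any fitted curve.

```python
def type4_central_moments(a, r, nu):
    """Central moments built from mu'_n = a/(r-n+1) * ((n-1)*a*mu'_{n-2} - nu*mu'_{n-1})."""
    raw = [1.0, -a * nu / r]                      # mu'_0 and mu'_1
    for n in range(2, 5):
        raw.append(a / (r - n + 1) * ((n - 1) * a * raw[n - 2] - nu * raw[n - 1]))
    d = raw[1]
    mu2 = raw[2] - d * d
    mu3 = raw[3] - 3 * d * raw[2] + 2 * d ** 3
    mu4 = raw[4] - 4 * d * raw[3] + 6 * d * d * raw[2] - 3 * d ** 4
    return mu2, mu3, mu4

def type4_closed_forms(a, r, nu):
    """The closed expressions for mu_2, mu_3, mu_4 given in the proof."""
    z = r * r + nu * nu
    mu2 = a ** 2 * z / (r ** 2 * (r - 1))
    mu3 = -4 * a ** 3 * nu * z / (r ** 3 * (r - 1) * (r - 2))
    mu4 = 3 * a ** 4 * z * ((r + 6) * z - 8 * r * r) / (
        r ** 4 * (r - 1) * (r - 2) * (r - 3))
    return mu2, mu3, mu4

# Hypothetical constants, for checking the algebra only.
via_recurrence = type4_central_moments(2.0, 10.0, 3.0)
via_closed_form = type4_closed_forms(2.0, 10.0, 3.0)
```

The two triples should agree to rounding error.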

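It will be useful to give another example of the calculation
of $y_0$ for curves of this type, and we may take a curve where
$r = 29.590$, $\nu = 19.886$, $a = 13.650$, $N = 2162$. Hence $\tan\phi
= .67205$, $\phi = 33°\,54'$, $\cos\phi = .82998$, $\log\cos\phi = \bar{1}.91907$,
and $\phi$ in circular measure is .59172.

$$\begin{aligned}
\log N &= 3.33486\\
\operatorname{colog} a &= \bar{2}.86486\\
\tfrac{1}{2}\log r &= .73557\\
\operatorname{colog}\sqrt{2\pi} &= \bar{1}.60091\\
\left\{\frac{\cos^2\phi}{3r} - \frac{1}{12r} - \phi\nu\right\}\log_{10}e = (.00776 - .00282 - 11.76700)\log_{10}e = -11.762 \times \log_{10}e &= \bar{6}.89183\\
\operatorname{colog}(\cos\phi)^{r+1} &= 2.47564\\
\log y_0 &= \bar{1}.90367\\
y_0 &= .8011
\end{aligned}$$

The form just considered is sufficiently accurate for all
practical purposes provided $r$ is not very small. If $r < 2$ the
tables in Tables for Statisticians must be used.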
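The approximate form for $y_0$ lends itself to a short routine. This sketch assumes the approximation is $y_0 = \frac{N}{a}\sqrt{\frac{r}{2\pi}}\,\exp\{\cos^2\phi/(3r) - 1/(12r) - \phi\nu\}/(\cos\phi)^{r+1}$ with $\tan\phi = \nu/r$, as used in the column of logarithms above; the function name is illustrative.

```python
from math import atan, cos, exp, pi, sqrt

def type4_y0_approx(N, a, r, nu):
    """Approximate y0 for a Type IV curve when r is fairly large (assumed form, see lead-in)."""
    phi = atan(nu / r)                      # circular measure
    c = cos(phi)
    exponent = c * c / (3.0 * r) - 1.0 / (12.0 * r) - phi * nu
    return (N / a) * sqrt(r / (2.0 * pi)) * exp(exponent) / c ** (r + 1)

# Second example of the text.
y0_second = type4_y0_approx(2162.0, 13.650, 29.590, 19.886)
# First example, with a expressed in years.
y0_first = type4_y0_approx(9154.0, 66.9576, 40.12143, 4.450398)
```

With the printed constants the two results should be close to .8011 and 273.36 respectively.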


( 75 ) 



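THIRD MAIN TYPE (TYPE VI)

$$y = y_0 (x - a)^{q_2} x^{-q_1}$$

Origin at $a$ before start of curve.

The values to be calculated in order are

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{6 + 3\beta_1 - 2\beta_2}$$

$$a = \tfrac{1}{2}\sqrt{\mu_2}\,\sqrt{\beta_1(r + 2)^2 + 16(r + 1)}$$

$q_2$ and $-q_1$ given by

$$\frac{r - 2}{2} \pm \frac{r(r + 2)}{2}\sqrt{\frac{\beta_1}{\beta_1(r + 2)^2 + 16(r + 1)}}$$

$$y_0 = \frac{N a^{q_1 - q_2 - 1}\,\Gamma(q_1)}{\Gamma(q_1 - q_2 - 1)\,\Gamma(q_2 + 1)}$$

$$\text{Origin} = \text{Mean} - \frac{a(q_1 - 1)}{q_1 - q_2 - 2}$$

$$\text{Mode} = \text{Mean} - \frac{1}{2}\,\frac{\mu_3}{\mu_2}\,\frac{r + 2}{r - 2}$$

If expressing curve with origin at mean (see Table VI,
facing p. 51):

$$A_1 = \frac{a(q_1 - 1)}{q_1 - q_2 - 2}, \qquad A_2 = \frac{a(q_2 + 1)}{q_1 - q_2 - 2}$$

$$y_0 = \frac{N}{a}\,\frac{(q_2 + 1)^{q_2}(q_1 - q_2 - 2)^{q_1 - q_2}}{(q_1 - 1)^{q_1}}\,\frac{\Gamma(q_1)}{\Gamma(q_1 - q_2 - 1)\,\Gamma(q_2 + 1)}$$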


( 76 ) 



NOTES 


The range is from $a$ to $\infty$ and the method is like that of
Type I. If $\mu_3$ is negative, then $a$ is negative and the range is
from $-\infty$ to $-a$.

$r$ is always negative and $q_1$ is greater than $q_2$. If $q_2$ is negative,
the curve is J-shaped.

The value of $y_0$ does not correspond to any frequency, as it
relates to a point before the curve starts.

The reader will probably find it easier to work with the origin
at the mean, and in the numerical example both forms are
shown.

EXAMPLE 

The number of entrants, limited payment policies, 1863-93
experience, was summed in groups of ten years of age and
divided by 100, and the following series was obtained:

| No. of entrants ÷ 100 | Graduated by Type VI curve |
|---|---|
| 1 | 1 |
| 56 | 50 |
| 167 | 168 |
| 98 | 100 |
| 34 | 36 |
| 9 | 10 |
| 2 | 2 |
| 1 | .5 |
| 368 | 368 |


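The moments, etc. were

Mean at .402174 after the centre of 167 group

| | | | |
|---|---|---|---|
| $\mu_2$ = | .928835 | $1 - q_1$ = | -41.03080 |
| $\mu_3$ = | .893096 | $1 + q_2$ = | 7.60950 |
| $\mu_4$ = | 4.088800 | $q_1$ = | 42.03080 |
| $\beta_1$ = | .9953605 | $q_2$ = | 6.60950 |
| $\beta_2$ = | 4.739349 | $a$ = | 10.37947 |
| $\kappa$ = | 1.895 | $\log y_0$ = | 46.1821 |
| $r$ = | -33.42129 | | |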


( 77 ) 




The origin is 12.74270 before the mean or 12.34053
before the centre of the 167 group, and the curve starts at
12.34053 - 10.37947 = 1.96106 before the centre of the largest
group. This makes the start of the curve at about age 10,
which is reasonable.

If we use the origin at the mean, we have

$A_1 = 12.74270$, $A_2 = 2.36324$, $y_0 = 147.4$

and the range is from -2.36324 to $\infty$.
The curve was calculated as follows:


| $x$ | $\log x$ | $\log (x - a)$ | $-q_1 \log x$ | $q_2 \log (x - a)$ | $\log y$ | $y$ |
|---|---|---|---|---|---|---|
| (1) | (2) | (3) | (4) | (5) | (6) | (7) |

There is no difficulty in writing down the values for cols. (2)
and (3) without using col. (1), as only the whole numbers in
$x$ and $x - a$ change, the decimal remaining constant so long
as equidistant ordinates are required. Cols. (4) and (5) are
obtained directly, and col. (6) by adding cols. (4) and (5) to
$\log y_0$. Cols. (2) and (3) can be formed continuously with the
aid of Gauss-logarithms.

The mode which is useful for drawing the curve is .02429
before the centre of the largest group.
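The constants of this example that fix the position of the curve can be recomputed from the three moments. The sketch below follows the order of calculation given for Type VI and uses illustrative names; because the quantities are sensitive to the last figures of the moments, agreement with the printed values should be expected only to three or four significant figures.

```python
from math import sqrt

mu2, mu3, mu4 = 0.928835, 0.893096, 4.088800   # moments as printed
mean_after_centre = 0.402174                   # mean, measured from the centre of the 167 group

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
r = 6 * (beta2 - beta1 - 1) / (6 + 3 * beta1 - 2 * beta2)

D = beta1 * (r + 2) ** 2 + 16 * (r + 1)
a = 0.5 * sqrt(mu2) * sqrt(D)

half_spread = 0.5 * r * (r + 2) * sqrt(beta1 / D)
q2 = (r - 2) / 2 + half_spread          # the positive root
q1 = -((r - 2) / 2 - half_spread)       # minus the negative root

A1 = a * (q1 - 1) / (q1 - q2 - 2)       # distance of the origin before the mean
A2 = a * (q2 + 1) / (q1 - q2 - 2)       # distance of the start of the curve before the mean
mean_minus_mode = 0.5 * (mu3 / mu2) * (r + 2) / (r - 2)

origin_before_centre = A1 - mean_after_centre
mode_before_centre = mean_minus_mode - mean_after_centre
```

Under these assumptions `A1`, `A2` and `mode_before_centre` come out near 12.743, 2.363 and .0243, in line with the text.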



( 78 ) 



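With the origin at the mean the form of the columns is
similar to that already shown for Type I.

PROOF

$$N = \int_a^{\infty} y_0 (x - a)^{q_2} x^{-q_1}\,dx
= \int_0^1 y_0\,a^{q_2 - q_1 + 1}\,z^{q_1 - q_2 - 2}(1 - z)^{q_2}\,dz$$

by substituting $1/z$ for $x/a$

$$= y_0\,a^{q_2 - q_1 + 1}\,B(q_2 + 1,\; q_1 - q_2 - 1).$$

$$y_0 = \frac{N}{a^{q_2 - q_1 + 1}\,B(q_2 + 1,\; q_1 - q_2 - 1)}
= \frac{N a^{q_1 - q_2 - 1}\,\Gamma(q_1)}{\Gamma(q_2 + 1)\,\Gamma(q_1 - q_2 - 1)}.$$

The $n$th moment about the origin is

$$\mu_n' = \frac{1}{N}\int_a^{\infty} y_0\,x^n (x - a)^{q_2} x^{-q_1}\,dx
= \frac{y_0}{N}\,a^{q_2 - q_1 + n + 1}\,\frac{\Gamma(q_1 - q_2 - n - 1)\,\Gamma(q_2 + 1)}{\Gamma(q_1 - n)}$$

by the same substitution as that used above.

From this last result we obtain, by inserting the value of $y_0$
and remembering the relationship between $\Gamma(q_1)$ and $\Gamma(q_1 - 1)$,
etc.,

$$\mu_1' = \frac{a(q_1 - 1)}{q_1 - q_2 - 2},$$

$$\mu_2' = \frac{a^2(q_1 - 1)(q_1 - 2)}{(q_1 - q_2 - 2)(q_1 - q_2 - 3)},$$

etc.

It will be noticed that these equations are the same as those
obtained for Type I if $m_1 = -q_1$, $m_2 = q_2$ and $b = a$. Thus, we
can use the whole of the Type I solution, provided we bear in
mind that the range is from $x = a$ to $x = \infty$.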

( 79 ) 



TRANSITION TYPE

"NORMAL CURVE OF ERROR"

$$y = y_0\,e^{-x^2/2\sigma^2}$$

$$y_0 = \frac{N}{\sigma\sqrt{2\pi}} = \frac{N}{\sqrt{2\pi\mu_2}}$$

NOTES

This curve has been known by various names, such as the
Probability Curve and the Gaussian Curve. It was discussed
before Gauss by de Moivre and Laplace. It is the limit of
$(p + q)^n$ where $p + q = 1$, when $n$ approaches infinity and if
neither $p$ nor $q$ is very small. It gives a close representation
of $(\tfrac{1}{2} + \tfrac{1}{2})^n$ even when $n$ is not large.


EXAMPLES

The following table gives, in col. (2), the sums assured and
bonuses, and in col. (4) the reserves resulting from grouping
a number of Endowment Assurances according to their office
years of birth:

| Central age for groups of 5 years of birth (1) | Sums assured and bonuses/1,000: Ungraduated (2) | Sums assured and bonuses/1,000: Graduated (3) | Reserves/1,000: Ungraduated (4) | Reserves/1,000: Graduated (5) |
|---|---|---|---|---|
| 17 | 11 | 13 | .6 | .6 |
| 22 | 48 | 40 | 2.8 | 2.8 |
| 27 | 124 | 104 | 11.5 | 10.9 |
| 32 | 213 | 202 | 27.7 | 30.1 |
| 37 | 281 | 282 | 59.1 | 58.4 |
| 42 | 295 | 288 | 84.7 | 79.9 |
| 47 | 185 | 214 | 74.1 | 77.0 |
| 52 | 104 | 116 | 50.5 | 52.2 |
| 57 | 40 | 44 | 23.2 | 25.0 |
| 62 | 15 | 13 | 12.2 | 8.4 |
| 67 | 3 | 3 | 1.3 | 2.4 |
| Total | 1,319 | 1,319 | 347.7 | 347.7 |


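The following table shows the moments and constants:

| Constant | Sum assured and bonus | Reserves |
|---|---|---|
| Mean age | 39.202426 | 43.967213 |
| $\mu_2$ | 3.066840 | 2.769635 |
| $\mu_3$ | .650127 | .029805 |
| $\mu_4$ | 27.02516 | 22.40663 |
| $\beta_1$ | .014653 | .0000418 |
| $\beta_2$ | 2.873346 | 2.920997 |
| $\kappa$ | -.005 | -.0002 |
| $\sigma\,(= \sqrt{\mu_2})$ | 1.751237 | 1.664222 |
| $\sigma^{-1}$ | .5710248 | .6008813 |
| $y_0$ | 300.4760 | 83.34959 |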


( 81 )





The criteria for the normal curve are $\kappa = 0$, $\beta_1 = 0$, and
$\beta_2 = 3$. The values given above do not differ very greatly from
these, but a comparison of the graduated and ungraduated
figures shows that the reserve curve agrees better than the
sum assured curve, partly because the value of $\beta_2$ is closer
to 3, and $\beta_1$ has a larger value in the case of the sum
assured.

For the calculation of the value of $y_0$,

colog $\sqrt{2\pi}$ = $\bar{1}$.6009100657

is required.

In finding the areas for the comparison between the
graduated and ungraduated figures it is unnecessary to
calculate the ordinates, as one of the calculated tables of the
probability integral can be used. The table by W. F. Sheppard
included in Tables for Statisticians is very convenient, and the
columns in the table below show how it was used to calculate
the areas in one of the cases (the reserves).

| Age $x$ | Distance from origin in calculation units, i.e. 5 years of age | Previous column × $\sigma^{-1}$ | Values of $\tfrac{1}{2}(1 + \alpha)$ from Sheppard's tables using differences (area from origin to $x$) | Difference of previous column = area for age group $x$ to $x + 5$ | Area multiplied by 347.7 (total frequency) |
|---|---|---|---|---|---|
| 14.5 | 5.893443 | 3.541258 | | .00164* | .6 |
| 19.5 | 4.893443 | 2.940377 | .99836 | .00802 | 2.8 |
| 24.5 | 3.893443 | 2.339496 | .99034 | .03139 | 10.9 |
| 29.5 | 2.893443 | 1.738615 | .95895 | .08657 | 30.1 |
| 34.5 | 1.893443 | 1.137734 | .87238 | .16806 | 58.4 |
| 39.5 | .893443 | .536853 | .70432 | .22985† | 79.9 |
| 44.5 | .106557 | .064028 | .52553 | .22141 | 77.0 |
| 49.5 | 1.106557 | .664909 | .74694 | .15018 | 52.2 |
| 54.5 | 2.106557 | 1.265790 | .89712 | .07190 | 25.0 |
| 59.5 | 3.106557 | 1.866671 | .96902 | .02418 | 8.4 |
| 64.5 | 4.106557 | 2.467552 | .99320 | .00572 | 2.0 |
| 69.5 | 5.106557 | 3.068433 | .99892 | .00108* | .4 |

* Remainders of areas beyond 19.5 and 69.5.

† (.70432 - .50000) + (.52553 - .50000) because we pass across the origin, and
a piece of the group is on each side of it.

Sheppard's tables
give the areas and ordinates of the normal curve in terms of
the standard deviation, that is, he assumes the standard
deviation to be unity, and his tables must be entered by
using intervals of $\sigma^{-1}$. A short abstract from Sheppard's table
is given on p. 265. Most other published tables are based on
the standard deviation multiplied by $\sqrt{2}$ and the distinction
must be borne in mind if other tables are used.

The second column can be left out when the method has
been grasped. The ages in the first column were taken con-
sistently with the assumptions that 17, 22, etc. were the
central ages of the groups.
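The use of Sheppard's table can be imitated with the error function. This is a sketch only: it replaces the printed table by `math.erf`, takes the mean and standard deviation of the reserves as printed, and, as in the text, throws the tails beyond 19.5 and 64.5 into the end groups. The names are illustrative.

```python
from math import erf, sqrt

mean_age = 43.967213   # reserves, from the table of constants
sigma = 1.664222       # standard deviation in units of 5 years
total = 347.7          # total frequency (reserves / 1,000)

def normal_cdf(z):
    """Area of the unit normal curve up to z, i.e. Sheppard's (1 + alpha)/2 for z > 0."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

bounds = [14.5 + 5 * k for k in range(12)]          # 14.5, 19.5, ..., 69.5
cdf = [normal_cdf((b - mean_age) / (5 * sigma)) for b in bounds]
cdf[0], cdf[-1] = 0.0, 1.0                          # tails go into the first and last groups

# Graduated reserves for central ages 17, 22, ..., 67.
areas = [total * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]
```

Rounded to one decimal, `areas` should reproduce col. (5) of the table of reserves, allowing for the last-figure rounding of the printed values.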

[Figure: "Normal curve of error", showing the statistics and curves for the sums assured and the reserves.]

If ordinates are required, the $z$ column in Sheppard's tables
must be used. It was with its help that the curves in the figure
were drawn. The statistics and curve for the reserves are
shown by the dotted lines.


( 83 ) 





An average reserve for any group can be obtained by means
of the graduated figures, and it could be used to test the reserves
obtained at any future valuation. This is by no means the only
rough check that can be applied, but it is interesting because
it shows a use to which frequency curves might be put in
practical office routine.


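PROOF

To show that

$$\int_0^{\infty} e^{-x^2}\,dx = \tfrac{1}{2}\sqrt{\pi},$$

let

$$\int_0^{\infty} e^{-x^2}\,dx = K;$$

then, substituting $ax$ for $x$, we have

$$\int_0^{\infty} e^{-a^2x^2}\,a\,dx = K.$$

Hence

$$\int_0^{\infty} e^{-a^2(1 + x^2)}\,a\,dx = K e^{-a^2}$$

and

$$\int_0^{\infty}\!\!\int_0^{\infty} e^{-a^2(1 + x^2)}\,a\,da\,dx = K\int_0^{\infty} e^{-a^2}\,da = K^2.$$

But

$$\int_0^{\infty} e^{-a^2(1 + x^2)}\,a\,da = \frac{1}{2}\,\frac{1}{1 + x^2}$$

and

$$\frac{1}{2}\int_0^{\infty}\frac{dx}{1 + x^2} = \frac{\pi}{4}.$$

Hence

$$K^2 = \frac{\pi}{4} \quad\text{or}\quad K = \tfrac{1}{2}\sqrt{\pi}, \quad\text{and}\quad \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$

The other constant is obtained as follows: writing the curve
as $y = y_0 e^{-x^2/c}$,

$$\mu_2 = \frac{y_0}{N}\int_{-\infty}^{\infty} x^2 e^{-x^2/c}\,dx
= \frac{c}{2}\,\frac{y_0}{N}\int_{-\infty}^{\infty} e^{-x^2/c}\,dx \;\text{(by parts)}\; = \frac{c}{2},$$

or $c = 2\mu_2$.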


( 85 ) 



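TRANSITION TYPE (TYPE II)

$$y = y_0\left(1 - \frac{x^2}{a^2}\right)^m$$

Origin at mode ( = mean)

$$m = \frac{5\beta_2 - 9}{2(3 - \beta_2)}$$

$$a^2 = \frac{2\mu_2\beta_2}{3 - \beta_2}$$

$$y_0 = \frac{N \times \Gamma(2m + 2)}{a \times 2^{2m+1}\{\Gamma(m + 1)\}^2}
= \frac{N}{a\sqrt{\pi}}\,\frac{\Gamma(m + 1\tfrac{1}{2})}{\Gamma(m + 1)}$$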


( 86 ) 



NOTES AND PROOF 


Put $\beta_1 = 0$ in Type I, for the curve is symmetrical, and
therefore $\mu_3 = 0$. For the same reason it is clear that
$a_1 = a_2$ and $m_1 = m_2$.

Approximations to Γ may be used if $m$ is large.

If $m$ is positive, the curve starts at zero, rises to a maximum
and falls again to zero; but if $m$ is negative, it starts at infinity,
falls, and then rises to infinity again.

EXAMPLE 

In the discussion that followed the reading of G. J. Lidstone's
paper on Endowment Assurances, G. F. Hardy said that "the
errors in the successive groups formed a curve very similar to
the normal curve of error" (J. Inst. Actu. xxxiv, 87), and the
series in question is a rather interesting example of a sym-
metrical distribution.


| Unexpired term in years | Error involved in using "mean age" method |
|---|---|
| 0-4 | 11 |
| 5-9 | 116 |
| 10-14 | 274 |
| 15-19 | 451 |
| 20-24 | 432 |
| 25-29 | 267 |
| 30-34 | 116 |
| 35, etc. | 16 |
| | 1,683 |


Moments were calculated about the centre of the 15-19
group, and .4985146, 2.161022, 3.104576, and 12.60666 were
found for the first four moments; transferring to the mean
(17.5 + 2.492573 = 19.992573), and using Sheppard's adjust-
ments, the following values result:

$\mu_2 = 1.829172$    $\beta_1 = .0023706$

$\mu_3 = .120452$     $\beta_2 = 2.548313$

$\mu_4 = 8.52636$     $\kappa = -.007492$

which shows that Type II can be used.

( 87 ) 




The equations for the type give

$m = 4.141766$, $a = 4.543079$, $y_0 = 462.57$.

The mean and mode coincide, because the curve is symmetrical.
For calculating a series of values, the following arrangement
is convenient:

| $x$ | $\log (1 + x/a)$ | $\log (1 - x/a)$ | (2) + (3) | $\log y_x = m \times (4) + \log y_0$ |
|---|---|---|---|---|
| (1) | (2) | (3) | (4) | (5) |

It is easier to work in this way than by calculating values of
$1 - x^2/a^2$. In the particular example, ordinates were calculated
at the beginning, middle, and end of each group, and Simpson's
quadrature formula was used for finding the areas, viz.


| Group | Ungraduated figures | Areas, Type II | Mid-ordinates, Type II | Areas, "Normal curve" |
|---|---|---|---|---|
| 0-4 | 11 | 14 | 11 | 22 |
| 5-9 | 116 | 109 | 104 | 95 |
| 10-14 | 274 | 286 | 287 | 270 |
| 15-19 | 451 | 433 | 440 | 455 |
| 20-24 | 432 | 433 | 440 | 455 |
| 25-29 | 267 | 285 | 287 | 269 |
| 30-34 | 116 | 109 | 104 | 95 |
| 35, etc. | 16 | 14 | 11 | 22 |
| | 1,683 | 1,683 | | 1,683 |

A comparison of the mid-ordinates with the areas gives an
idea of the error involved in using the former for the latter;
the differences are largest at the "tails" and near the
mode.
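The Type II constants of this example, and the Simpson's-rule area of one group, can be recomputed as follows. The sketch assumes the formulae $m = (5\beta_2 - 9)/\{2(3 - \beta_2)\}$, $a^2 = 2\mu_2\beta_2/(3 - \beta_2)$ and $y_0 = N\,\Gamma(m + 1\tfrac{1}{2})/\{a\sqrt{\pi}\,\Gamma(m + 1)\}$, treats the 15-19 group as running from 15 to 20, and works in five-year units; the names are illustrative.

```python
from math import exp, lgamma, pi, sqrt

N, mu2, beta2 = 1683.0, 1.829172, 2.548313   # as printed
mean_term = 19.992573                        # mean, in years

m = (5 * beta2 - 9) / (2 * (3 - beta2))
a = sqrt(2 * mu2 * beta2 / (3 - beta2))
y0 = N / (a * sqrt(pi)) * exp(lgamma(m + 1.5) - lgamma(m + 1))

def y(x):
    """Ordinate at x five-year units from the mean."""
    return y0 * (1 - x * x / (a * a)) ** m

# Simpson's rule on the beginning, middle and end of the 15-19 group.
lo = (15 - mean_term) / 5
hi = (20 - mean_term) / 5
mid = (lo + hi) / 2
area_15_19 = (hi - lo) * (y(lo) + 4 * y(mid) + y(hi)) / 6
mid_ordinate_15_19 = y(mid)
```

Under these assumptions the area comes out near 433 and the mid-ordinate near 440, as in the table.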


( 88 ) 



The curve starts at 19.992573 - 22.71540 = -2.72283, and
ends at 42.70797.

The final column of the table gives a graduation by the
"normal curve".

[Figure: Type II.]






TRANSITION TYPE (TYPE VII)

$$y = y_0\left(1 + \frac{x^2}{a^2}\right)^{-m}$$

Origin at mode (= mean)

NOTES AND PROOF

The curve may be taken as a special case of Type IV when
$\nu = 0$, or it can be evolved from Type II by making both $m$ and
$a^2$ negative in that type. This happens when $\beta_2 > 3$. The curve
is symmetrical and of unlimited range in both directions.

$$N = \int_{-\infty}^{\infty} y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} dx;$$

then putting $1 + \dfrac{x^2}{a^2} = \dfrac{1}{z}$ the reader will be able to show that

$$N = \int_0^1 y_0\,a\,(1 - z)^{-\frac{1}{2}}\,z^{m - \frac{3}{2}}\,dz = y_0\,a\,B(\tfrac{1}{2},\; m - \tfrac{1}{2})$$

or $y_0$ = the value shown on the preceding page, because
$\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$.

EXAMPLE

The following table gives the areas when $\beta_2 = 5$ and $\mu_2 = 1$
and shows a graduation by the "normal curve". The example,
together with that of Type II, will act as a reminder that the
"normal curve" does not give entirely satisfactory results even
with symmetrical distributions.


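| Type VII, $m = 4$, $a^2 = 5$ | Normal curve, $\sigma = 1$ |
|---|---|
| 1 | |
| 1 | |
| 2 | |
| 4 | |
| 7 | 1 |
| 16 | 5 |
| 38 | 24 |
| 93 | 93 |
| 225 | 278 |
| 527 | 656 |
| 1,106 | 1,210 |
| 1,858 | 1,746 |
| 2,244 | 1,974 |
| 1,858 | 1,746 |
| etc. | etc. |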


( 91 ) 




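TRANSITION TYPE (TYPE III)

$$y = y_0\left(1 + \frac{x}{a}\right)^{\gamma a} e^{-\gamma x}$$

Origin at mode

$$\gamma = \frac{2\mu_2}{\mu_3}$$

$$p = \gamma a = \frac{4}{\beta_1} - 1$$

$$a = \frac{2\mu_2^2}{\mu_3} - \frac{\mu_3}{2\mu_2}$$

$$y_0 = \frac{N}{a}\,\frac{p^{p+1}}{e^p\,\Gamma(p + 1)}$$

$$\text{Mode} = \text{Mean} - \frac{\mu_3}{2\mu_2}$$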


If expressing curve with origin at mean (see Table VI, facing
p. 51):

\r (jP d" ^ )^‘ 



NOTES 


The curve is usually bell-shaped, but becomes J-shaped
when $p < 0$, that is, when $\beta_1 > 4$. The range is limited in one
direction only. The criterion is that $2\beta_2 = 6 + 3\beta_1$. Theoretically
this gives $\kappa = \infty$ but the curve may be used in many cases where
$\kappa$ is not very large, provided $2\beta_2$ approximates to $6 + 3\beta_1$.
When $\mu_3$ is positive $\gamma$ and $a$ are positive, so that the range is
limited at a distance of $a$ before the mode; when $\mu_3$ is negative
$\gamma$ and $a$ are negative, so that the range is limited at a distance $a$
after the mode. If, however, $\beta_1 > 4$, then $a$ and $\gamma$ have different
signs.

EXAMPLE

The following statistics are taken from a paper in the Trans.
Actu. Soc. Edinb. iv, 44, and give the numbers of wives
tabulated for the ages of mothers, and according to years since
marriage. The mothers' ages for the particular series are 30 to
34.


| Year after marriage | Number of wives | Graduated by Type III curve |
|---|---|---|
| 1 | 44 | 59 |
| 2 | 135 | 111 |
| 3 | 45 | 45 |
| 4 | 12 | 20 |
| 5 | 8 | 9 |
| 6 | 3 | 4 |
| 7 | 1 | 2 |
| 8 | 3 | 1 |
| Total | 251 | 251 |


The mean is .3346612 after the middle of the second group,
and the moments about the centroid vertical are 1.441787,
3.606622 and 18.93221, so that $\kappa = -8.44$.
As this value was large, Type III was used, and

   $\gamma = .7995221$      $a = -.098007$

   $p = -.0783584$






This example is given because it can be used to show a
difficulty rather clearly. At first sight, a curve starting at zero,
rising to a maximum, and then falling, might be expected.
Instead, we find the curve starting at duration .68192,* so that
the first group is made up of a strip on a base .31808 in length,
and has a smaller value than the next group, though any
ordinate read off within the first group would be larger than any
ordinate in the second group. No adjustment was made to the
rough moments.

* The mode in ordinary cases of Type III is given by mean $- \mu_3/2\mu_2$. In this
case $\mu_3/2\mu_2 = 1.25075$, so the mode would be at .58391, and the curve would start
at {"mode" $- a$} = .58391 + .09801 = .68192.
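The arithmetic of this example can be retraced as follows. The sketch assumes that $\kappa$ is the usual Pearson criterion $\beta_1(\beta_2+3)^2 / \{4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)\}$, which reproduces the value quoted, and takes the mean as 1.5 + .3346612 on the duration scale (the middle of the second group being at 1.5). The function and variable names are illustrative only.

```python
def type_iii_constants(mu2, mu3, mu4, mean):
    """Type III constants from moments about the mean (origin of the curve at the mode)."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    # Pearson's criterion (assumed form; it is not restated in this section).
    kappa = beta1 * (beta2 + 3) ** 2 / (
        4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6))
    gamma = 2 * mu2 / mu3
    p = 4 / beta1 - 1
    a = p / gamma                    # since p = gamma * a
    mode = mean - mu3 / (2 * mu2)
    start = mode - a                 # the range is limited at distance a before the mode
    return {"beta1": beta1, "beta2": beta2, "kappa": kappa,
            "gamma": gamma, "p": p, "a": a, "mode": mode, "start": start}

# Moments quoted in the example; mean = middle of second group (1.5) + .3346612.
consts = type_iii_constants(1.441787, 3.606622, 18.93221, mean=1.5 + 0.3346612)
for name, value in consts.items():
    print(f"{name:6s} {value: .6f}")
```

Because $p$ is negative here, $a$ is negative and the curve "starts" after the nominal mode, which is the difficulty the example is meant to show.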



PROOF 


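In $y = y_0\left(1+\dfrac{x}{a}\right)^{\gamma a} e^{-\gamma x}$ put
$\gamma a = p$, and substitute $z$ for $\gamma(a+x)$; then, if $N$ be the total
frequency,

$$N = \int_{-a}^{\infty} y\,dx = y_0\int_0^{\infty} a^{-p}\,e^{p}\,\gamma^{-(p+1)}\,z^{p}e^{-z}\,dz \qquad \left(\text{for } dx = \frac{dz}{\gamma}\right)$$

$$= y_0\,\frac{a\,e^{p}}{p^{\,p+1}}\int_0^{\infty} z^{p}e^{-z}\,dz.$$

This gives $y_0 = \dfrac{N\,p^{\,p+1}}{a\,e^{p}\,\Gamma(p+1)}$.

The $n$th moment about the start of the curve is

$$\frac{1}{N}\int_{-a}^{\infty} y_0\left(1+\frac{x}{a}\right)^{p} e^{-\gamma x}(x+a)^{n}\,dx
= \frac{y_0\,e^{p}}{N\,p^{p}\,\gamma^{n+1}}\int_0^{\infty} z^{p+n}e^{-z}\,dz
= \frac{\Gamma(p+n+1)}{\gamma^{n}\,\Gamma(p+1)}$$

by using the value of $y_0$ found above.

Since $\Gamma(p) = (p-1)\,\Gamma(p-1)$, the first moment is $\dfrac{p+1}{\gamma}$, the
second $\dfrac{(p+1)(p+2)}{\gamma^{2}}$ and the third $\dfrac{(p+1)(p+2)(p+3)}{\gamma^{3}}$. In
order to apply these formulae to statistical work, it is necessary
to have moments about the centroid vertical, the position of
which (the mean) can be found, and as, by definition, the first
moment about it is zero, we get

$$\mu_2 = \frac{p+1}{\gamma^{2}}, \qquad \mu_3 = \frac{2(p+1)}{\gamma^{3}}.$$

These results give $\gamma$ and $p$ as $\dfrac{2\mu_2}{\mu_3}$ and
$\dfrac{4\mu_2^{3}}{\mu_3^{2}} - 1$ respectively.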


TRANSITION CURVE (TYPE V) 


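$$y = y_0\,x^{-p}\,e^{-\gamma/x}$$

Origin at start of curve.

$$p = 4 + \frac{8 + 4\sqrt{4+\beta_1}}{\beta_1}, \qquad
\gamma = (p-2)\sqrt{\mu_2(p-3)}, \qquad
y_0 = \frac{N\gamma^{\,p-1}}{\Gamma(p-1)}$$

$$\text{Origin} = \text{Mean} - \frac{\gamma}{p-2}, \qquad
\text{Mode} = \text{Mean} - \frac{2\gamma}{p(p-2)}$$

The sign of $\gamma$ is the same as that of $\mu_3$.

If expressing curve with origin at mean (see Table VI, facing
p. 51):

$$A = \frac{\gamma}{p-2}, \qquad y_e = \frac{N(p-2)^{p}}{\gamma\,e^{\,p-2}\,\Gamma(p-1)}$$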



EXAMPLE 


The following series of deaths is taken from G. King's paper
"On the rate of mortality amongst female nominees, etc."
(J. Inst. Actu. xxxiii, 262-8).


     Ages        Deaths      Graduated by
                                Type V

    30-34            1             1
    35-39            5             3
    40-44            8             6
    45-49           12            14
    50-54           28            32
    55-59           82            68
    60-64          128           137
    65-69          253           247
    70-74          342           381
    75-79          525           480
    80-84          438           441
    85-89          265           261
    90-94           53            80
    95-99           18            10
    100, etc.        4             1

                 2,162         2,162


The mean is at age 75.9782605, and the moments (adjusted),
etc. are

   $\mu_2 = 3.573346$       $\beta_1 = .4950399$
   $\mu_3 = -4.752613$      $\beta_2 = 3.996134$
   $\mu_4 = 51.02583$       $\kappa = .85$

Strictly speaking, Type IV should be used, but the value is not
very far from unity, and the following Type V constants were
found:

   $p = 37.29145$
   $\gamma = -390.6609$ (negative, because $\mu_3$ is)
   $\log y_0 = 56.930518$

The approximation to the value of $\log \Gamma(p-1)$ was used. The
origin is at age 131.32606, and the mode at 78.9467.
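These constants can be recovered from the moments with a short calculation. The sketch below assumes that the moments are expressed in units of five years of age (the width of the groups), which is what makes the quoted origin and mode come out; the names used are illustrative.

```python
import math

def type_v_constants(mu2, mu3, mean_age, unit=5.0):
    """Type V constants from moments in grouping units; ages returned in years."""
    beta1 = mu3 ** 2 / mu2 ** 3
    # p - 4 is the positive root of beta1*q^2 - 16*q - 16 = 0.
    p = 4 + (8 + 4 * math.sqrt(4 + beta1)) / beta1
    gamma = (p - 2) * math.sqrt(mu2 * (p - 3))
    gamma = math.copysign(gamma, mu3)           # gamma takes the sign of mu3
    origin_age = mean_age - unit * gamma / (p - 2)
    mode_age = origin_age + unit * gamma / p    # the mode is gamma/p from the origin
    return beta1, p, gamma, origin_age, mode_age

beta1, p, gamma, origin_age, mode_age = type_v_constants(
    mu2=3.573346, mu3=-4.752613, mean_age=75.9782605)
print(f"beta1 = {beta1:.7f}, p = {p:.5f}, gamma = {gamma:.4f}")
print(f"origin at age {origin_age:.5f}, mode at age {mode_age:.4f}")
```

With $\mu_3$ negative the origin lies above the mean (about age 131) and the curve is read backwards from it, which is why the text suggests imagining the statistics in inverse order.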

The columns used for calculating the ordinates were:

    $x$    $\log x$    $-p\log x$    $-\gamma\log_{10}e / x$    $\log y = \log y_0 + (3) + (4)$    $y$ = antilog (5)
   (1)      (2)         (3)              (4)                       (5)                          (6)

Col. (4) is best formed by putting $\gamma\log_{10}e$ on the plate of the
arithmometer, multiplying it by $1/x$, obtained from a table of
reciprocals, and reading off the result negatively.

The point to be borne in mind in drawing a curve of this type
is that as the mode and origin are not at the same place, care
must be taken to give the maximum ordinate its right position
and magnitude (cf. Type IV).

The graduated figures agree fairly closely with the original
statistics below the 90-94 group, but are unsuitable for that
and the two later groups. The reason is that Type IV, having
an unlimited range, should be used. The particular case was
chosen partly because an example in which $\mu_3$ is negative is
rather more awkward than when $\mu_3$ is positive. In such cases it
is a good check to imagine the statistics written in inverse
order (in this case 4, 18, 53, etc.), and so avoid the negative
signs.




PROOF 

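Putting $\gamma/x = z$ in $y = y_0 x^{-p}e^{-\gamma/x}$ and integrating from 0
to $\infty$, we have

$$N = y_0\,\gamma^{-(p-1)}\,\Gamma(p-1), \qquad\text{or}\qquad y_0 = \frac{N\gamma^{\,p-1}}{\Gamma(p-1)}.$$

Using the same substitution, the $n$th moment about the
origin is

$$\mu_n' = \frac{y_0}{N}\int_0^{\infty} x^{\,n-p}e^{-\gamma/x}\,dx
= \frac{y_0}{N}\,\gamma^{-(p-n-1)}\,\Gamma(p-n-1)
= \gamma^{n}\,\frac{\Gamma(p-n-1)}{\Gamma(p-1)}.$$

This gives

$$\mu_1' = \frac{\gamma}{p-2},$$

which is the distance between the mean and origin,

$$\mu_2' = \frac{\gamma^{2}}{(p-2)(p-3)}, \qquad
\mu_3' = \frac{\gamma^{3}}{(p-2)(p-3)(p-4)}.$$

Transferring the moments to the centroid vertical,

$$\mu_2 = \frac{\gamma^{2}}{(p-2)^{2}(p-3)} \qquad\text{and}\qquad
\mu_3 = \frac{4\gamma^{3}}{(p-2)^{3}(p-3)(p-4)},$$

so that

$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{16(p-3)}{(p-4)^{2}} = \frac{16}{p-4} + \frac{16}{(p-4)^{2}}$$

and

$$(p-4)^{2} - \frac{16}{\beta_1}(p-4) - \frac{16}{\beta_1} = 0.$$

$p - 4$ will have to be taken as the positive root of the equation,
or $\gamma$, which from the above equations is given by

$$(p-2)\sqrt{\mu_2(p-3)},$$

will be imaginary.

Since the tangent to the curve at the top of the maximum
ordinate is parallel to the axis of $x$, the position of the mode is
such that $dy/dx$ is zero there, i.e.
$y_0\,x^{-p}e^{-\gamma/x}\left(\dfrac{\gamma}{x^{2}} - \dfrac{p}{x}\right)$ is zero.
$x = 0$ and $x = \infty$ give the cases in which the curve touches the
axis of $x$, and the other case, the one required, is when
$\dfrac{\gamma}{x} - p = 0$, or $x = \dfrac{\gamma}{p}$, i.e. the mode is
$\dfrac{\gamma}{p}$ from the origin.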


Uncommon Frequency Types 

Up to the present we have dealt with common types of
frequency-curves, but in the course of statistical work a
distribution is sometimes found which appears different in its
algebraic form from the usual types, but can nevertheless be
described accurately by those types. An example which will
give an indication of the kind of case we have in mind is a
distribution arising from recording the number of sequences in
coin-tossing or dice-throwing experiments: the distribution is a
geometric progression and this, a well-known result in
probability, is a special case of Type III if $p = 0$, for we then obtain
the exponential which gives the series we want. Certain
limiting cases of Types I, II and VI give straight lines, curves
starting with an infinite and ending with a finite ordinate, two
separated blocks of frequency, and curves starting at a finite
ordinate and ending at zero either at a finite point or at
infinity: among these last is, of course, the exponential to which
we have already referred.

Before turning to the expressions for these new types it may
be useful to give a table of various peculiar distributions that
have been obtained from insurance and other material.


Examples of uncommon Frequency Types

     (1)      (2)      (3)       (4)       (5)      (6)

     469       45      119     4,165        33       68
     186       38      100     2,028        53       24
     166       46       86       982        65       17
     134       53       75       480        81       14
     122       43       61       266       101       12
     112       38       50       132       131       11
               49       39        71       186       10
               41       27        36       350       10
               44       22        17                 10
               52       12         9                 11
                         3         2                 12
                                   1                 20
                                   1
                                   1
                                   1

   1,189      449      594     8,192     1,000      219


The table includes (col. 6) areas of a U-shaped curve which
is rare; in fact, I have not succeeded in finding a suitable
distribution of this shape among actuarial statistics, but such a
distribution might occur among terminations (including
withdrawals) in term policies of ten years, say, or similar endowment
assurances.

We may now deal with these cases, but we shall discuss them
in less detail than the more important types.

Type VIII 

$$y = y_0\left(1+\frac{x}{a}\right)^{-m}$$

Range from an infinite ordinate at $-a$ to a finite ordinate
at 0.

$m$ is found from the solution of

$$m^{3}(4-\beta_1) + m^{2}(9\beta_1-12) - 24\beta_1 m + 16\beta_1 = 0$$

and must be neither < 0 nor > 1.

$$a = \pm\,\sigma(2-m)\sqrt{\frac{3-m}{1-m}}$$

$$y_0 = N(1-m)/a$$

The distance of the mean from $x = -a$ is $a(1-m)/(2-m)$,
and from $x = 0$ is $-a/(2-m)$. When $\mu_3$ is positive $a$ is negative.
If we use the form with origin at mean (see Table VI, facing
p. 51)

$$A = a(1-m)/(2-m) \qquad\text{and}\qquad y_e = N(1-m)^{1-m}(2-m)^{m}/a.$$

The curve is a special case of Type I when $m_2$ is zero, that
is, when

$$r - 2 = r(r+2)\sqrt{\frac{\beta_1}{\beta_1(r+2)^{2} + 16(r+1)}}$$

where $r = 6(\beta_2-\beta_1-1)/(6+3\beta_1-2\beta_2)$.

Thus the test for the suitability of the curve is that

$$\frac{(4\beta_2-3\beta_1)(10\beta_2-12\beta_1-18)^{2} - \beta_1(\beta_2+3)^{2}(8\beta_2-9\beta_1-12)}
{(3\beta_1-2\beta_2+6)\{\beta_1(\beta_2+3)^{2} + 4(4\beta_2-3\beta_1)(3\beta_1-2\beta_2+6)\}}$$

or $\lambda$, say, is zero.

The criteria for Type VIII can be reduced to (1) special case
of Type I, (2) $\lambda = 0$, (3) $5\beta_2 - 6\beta_1 - 9$ is negative. It may be
added that $24\beta_2 - 27\beta_1 - 38$ is small; theoretically positive.

If $\beta_1 = 0$ an interesting special case arises, in which $m = 0$,
and the curve becomes a horizontal line, which is also the limit
of Types IX and XII.

The solution of the cubic for $m$ gives trouble. $m$ can also be
found from $m = -2(5\beta_2 - 6\beta_1 - 9)/(3\beta_1 - 2\beta_2 + 6)$, and though
this involves $\beta_2$ it should theoretically give the same value of
$m$ as the cubic. As the criterion is not exactly reached in
practice the two results differ, and it seems preferable to find
$m$ from the cubic by using

$$m = \frac{24\beta_1 - \sqrt{(24\beta_1)^{2} - 64\beta_1[m'(4-\beta_1) + 9\beta_1 - 12]}}{2[m'(4-\beta_1) + 9\beta_1 - 12]}$$

where $m'$ is found from the expression in $\beta_1$ and $\beta_2$ given above
or by some other trial method.

An alternative is to find from the criteria or from the
diagram in Tables for Statisticians the value of $\beta_2$ which is the
consequence of the particular value of $\beta_1$ when a Type VIII
curve occurs, and use this theoretical value in finding $m$ instead
of the $\beta_2$ given by the actual statistics.


Example 


   Frequency     Graduation (1)     Graduation (2)

      469             437                436
      186             222                209
      166             165                161
      134             136                141
      122             120                127
      112             109                115

    1,189           1,189              1,189


The mean is .65518 of an interval after the centre of the 186
group. The constants were

   $\mu_2 = 2.986$       $\beta_1 = .408$
   $\mu_3 = 3.295$       $\beta_2 = 2.047$
   $\mu_4 = 18.252$      $\lambda = -.05$

$5\beta_2 - 6\beta_1 - 9$ is negative. Hence Type VIII can be used,

   $m = .500$
   $a = -5.797$
   $y_0 = 102.6$

The curve runs from .277 before the middle of the first to
.02 after the end of the last group. The graduation is shown
(No. 1) above.
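As a modern alternative to the trial methods described for the cubic, the root between 0 and 1 can be found by bisection, since the left-hand side equals $16\beta_1$ at $m = 0$ and $-8$ at $m = 1$ whatever the value of $\beta_1$. The sketch below is only an illustration of that idea (bisection is not the method used in the text) and returns the numerical value of $a$, the sign being settled by $\mu_3$ as stated above.

```python
import math

def type_viii_constants(n_total, mu2, beta1, tol=1e-12):
    """Solve the Type VIII cubic for m by bisection on (0, 1), then find a and y0."""
    def cubic(m):
        return (m ** 3 * (4 - beta1) + m ** 2 * (9 * beta1 - 12)
                - 24 * beta1 * m + 16 * beta1)

    lower, upper = 0.0, 1.0        # cubic(0) = 16*beta1 > 0 and cubic(1) = -8 < 0
    while upper - lower > tol:
        middle = 0.5 * (lower + upper)
        if cubic(middle) > 0:
            lower = middle
        else:
            upper = middle
    m = 0.5 * (lower + upper)
    sigma = math.sqrt(mu2)
    a = sigma * (2 - m) * math.sqrt((3 - m) / (1 - m))   # numerical value of a
    y0 = n_total * (1 - m) / a
    return m, a, y0

m, a, y0 = type_viii_constants(n_total=1189, mu2=2.986, beta1=0.408)
print(f"m = {m:.3f}, |a| = {a:.3f}, y0 = {y0:.1f}")
```

For the constants of the example this gives values agreeing with $m = .500$, $a = -5.797$ (numerically) and $y_0 = 102.6$ to about the accuracy printed.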

The areas can be calculated by the expression

$$\frac{y_x\,(x+a)}{1-m},$$

which gives the area of the remainder from $x$ to $-a$. In the
particular case the range could be fixed at 6 as the data related
to six months' experience of maturities among endowment
assurances, and remembering that the mean is

$$a(1-m)/(2-m)$$

we found

   $m = .439$      $y_0 = 111.1$

The areas resulting are given in graduation (2). The following
table gives the calculation of the areas in this case. The
equation to the curve is

$$y = 111.1\left(1-\frac{x}{6}\right)^{-.439}$$

with range from 0 to 6, $a$ being negative because $\mu_3$ is positive.


    $x$    $1 - x/6$   Colog (2)   (3) $\times\, m$   (4) + log 111.1   $\log\dfrac{y_x}{1-m}$   Antilog (6)   Remainder   (7) $\times$ (8)    Area
                                                $= \log y_x$                                      of range                required
   (1)      (2)         (3)         (4)           (5)               (6)            (7)          (8)         (9)         (10)

    0                                                                                            6         1,189        115
    1      .8333       .0792       .0348        2.0804            2.3318         214.7           5         1,074        127
    2      .6667       .1761       .0774        2.1230            2.3744         236.8           4           947        141
    3      .5000       .3010       .1323        2.1779            2.4293         268.7           3           806        161
    4      .3333       .4815       .2118        2.2574            2.5088         322.7           2           645        209
    5      .1667       .7781       .3419        2.3875            2.6389         435.5           1           436        436


Some of the columns can be dispensed with, they are shown 
in detail to make the method clear. 

Both graduations are reasonably close to the facts. 
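The second graduation can also be reached without logarithms, since with the range fixed the area lying within a distance $d$ of the infinite ordinate is $N(d/a)^{1-m}$. The sketch below assumes, as the text implies, a range of 6 beginning at the start of the first group, so that the mean is .5 + 1 + .65518 from the infinite ordinate; the function name is illustrative. Individual areas may differ by a unit or two from the printed column, which was worked with four-figure logarithms.

```python
def type_viii_fixed_range(n_total, rng, mean_from_infinite_end, n_groups):
    """Type VIII with the range fixed in advance: m from the mean, then group areas."""
    # mean = rng*(1-m)/(2-m), solved for m.
    m = (rng - 2.0 * mean_from_infinite_end) / (rng - mean_from_infinite_end)
    y0 = n_total * (1.0 - m) / rng
    width = rng / n_groups
    # Area between the infinite ordinate and a point at distance d from it.
    cumulative = [n_total * (k * width / rng) ** (1.0 - m) for k in range(n_groups + 1)]
    areas = [cumulative[k + 1] - cumulative[k] for k in range(n_groups)]
    return m, y0, areas

m, y0, areas = type_viii_fixed_range(1189, 6.0, 0.5 + 1.0 + 0.65518, 6)
print(f"m = {m:.3f}, y0 = {y0:.1f}")
print([round(area, 1) for area in areas])   # first entry belongs to the 469 group
```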




An example of the limiting case will be found in the following
statistics:


    No.     Frequency     Graduation     Theoretical

     1          45            42              45
     2          38            45              45
     3          46            45              45
     4          53            45              45
     5          43            45              45
     6          38            45              45
     7          49            45              45
     8          41            45              45
     9          44            45              45
    10          52            48              45

               449           450             450


The mean is .57 after the middle of the 5 group, the moments

   $\mu_2 = 8.374$       $\beta_1 = .000$
   $\mu_3 = .026$        $\beta_2 = 1.78$
   $\mu_4 = 124.46$

The range is from .57 to 10.43.

The series was found by summing in tens the last figure of
the Carlisle 3½ per cent. Table of …, and the mean should be at 5.5
theoretically instead of 5.57 and $y$ should be 45. The range
should be .5 to 10.5. The example is interesting as showing how
the Pearson-curves graduate in an extreme case. The
"graduation" and theoretical results are shown. In the "graduation"
decimals have been neglected.

Type IX 

$$y = y_0\left(1+\frac{x}{a}\right)^{m}$$

Range from $x = -a$ where $y = 0$ to $x = 0$ where $y = y_0$.

$$a = \pm\,\sigma(m+2)\sqrt{\frac{m+3}{m+1}}$$

$m$ is found by solving

$$m^{3}(\beta_1-4) + m^{2}(9\beta_1-12) + 24m\beta_1 + 16\beta_1 = 0$$

$$y_0 = N(m+1)/a$$

The distance of the mean from $x = -a$ is $a(m+1)/(m+2)$ and
from $x = 0$ is $-a/(m+2)$.

If we use the form with origin at mean (see Table VI, facing
p. 51), $A = (m+1)a/(m+2)$ and

$$y_e = \frac{N(m+1)^{m+1}}{a(m+2)^{m}}.$$

As in Type VIII, the value of $m$ can be found by simplifying
the cubic into a quadratic, or by the other method indicated.

The criteria are reached through the same equation as those
for Type VIII, and can be reduced to (1) special case of Type I,
(2) $\lambda = 0$, (3) $5\beta_2 - 6\beta_1 - 9$ is positive, (4) $2\beta_2 - 3\beta_1 - 6$ is
negative.

If $\beta_2 = 2.4$ and $\beta_1 = .32$ the curve becomes a sloping line.

If $\beta_1 = 0$ we reach a horizontal line as the limit, while if
$\beta_2 = 9$ and $\beta_1 = 4$, we have the other limit of Type IX, and
find the exponential series (Type X).


Example 


   Duration     Exposed to risk     Frequency by     Frequency by
                  in annuity          Type IX            line
                  experience

       0             119                118               108
       1             100                 98                97
       2              86                 85                86
       3              75                 74                76
       4              61                 63                65
       5              50                 52                54
       6              39                 41                43
       7              27                 30                32
       8              22                 20                22
       9              12                 11                11
      10               3                  2                 0

                     594                594               594






The mean is at 2.909 assuming the exposed to risk to be an
ordinate at the duration or an area from $-\tfrac12$ to $+\tfrac12$.

   $\mu_2 = 6.27$        $\beta_1 = .490$
   $\mu_3 = 10.99$       $\beta_2 = 2.606$
   $\mu_4 = 102.50$

$5\beta_2 - 6\beta_1 - 9$ is positive. $2\beta_2 - 3\beta_1 - 6$ is negative.

The curve is not far from Type IX, and if $\beta_1$ had been .32 and
$\beta_2$ had been 2.4, we should have reached a straight line
with range from $-.6$ to 10.0 and obtained the graduation
shown. The whole area $-.6$ to $+.5$ is taken as the frequency
for duration 0. Using Type IX, the following constants are
reached:

   $m = 1.123$,   $a = -10.913$,   $y_0 = 115.64$

The curve runs from $-.586$ to 10.3275. The 118 in the first
group has been taken as the area from $-.586$ to $+.5$.
Theoretically there cannot be an exposure before duration $-.5$, but
as we are merely giving an example of fitting a curve to a series
of numbers this need not concern us. The difficulty could be
met by fitting a system of ordinates or by assuming a starting
point for the curve.

If $m$ happens to be less than unity the shape of the curve is
somewhat different, e.g. if $y = 100\left(1+\dfrac{x}{10}\right)^{1/4}$ we have the
following ordinates:

100, 98, 95, 91, 88, 84, 79, 74, 67, 56, 0

The actual deaths in a select mortality experience may take
this form, but the shape of the curve will be less flat at the start,
e.g. in the American Medico-Actuarial experience 1913, age
group 30-34.





Type X

$$y = y_0\,e^{-x/\sigma}, \qquad y_0 = N/\sigma$$

Range from 0 to $\infty$.

Distance of origin from the mean is $\sigma$.
The ordinate at the mean ($y_e$) is $N/e\sigma$.

The curve is a special case of Type III when $\gamma a = 0$, that is
when $\beta_1 = 4$.

The condition for Type III is given by $2\beta_2 = 6 + 3\beta_1$. Hence
the exponential form is given by $\beta_1 = 4$ and $\beta_2 = 9$. The curve
is also the limit of Types IX and XI.


Example 



              Frequency     Graduation     Theoretical

     1          4,165          4,132          4,096
     2          2,028          2,016          2,048
     3            982          1,015          1,024
     4            480            511            512
     5            266            257            256
     6            132            130            128
     7             71             65             64
     8             36             33             32
     9             17             17             16
    10              9              8              8
    11              2              4              4
    12              1              2              2
    13              1              1              1
    14              1              1              1
    15              1

                8,192          8,192          8,192


The unadjusted mean is $2 - .0087$.

   $\mu_2 = 2.045$       $\beta_1 = 4.629$
   $\mu_3 = 6.290$       $\beta_2 = 9.50$
   $\mu_4 = 39.720$      $\sigma = 1.43$

When the curve is an exponential the moments and mean
require adjustment, but the Sheppard high contact adjustments
are, of course, unsuitable. If the curve starts at the
beginning of the first group, I think that the mean is overstated
when $\mu_3$ is positive by $1/12\sigma$ approximately, and the second
moment about the true mean is understated by … approximately.*
Making use of the adjustments the mean is now
1.934, $\mu_2$ is 2.123, and $\sigma$ is 1.457.

The statistics relate to sequences in coin-tossing and the
theoretical figures are added. In the statistics as published the
sequences of 11, 12, etc. were 2, 1, 0, 1, 0, 2. Strictly speaking
we are dealing with a system of ordinates; I made the
calculation as a series of areas in order to introduce the adjustment of
moments. In calculating the graduated areas of the curve it is
useful to remember that the area from $a$ to $b$ is $(y_a - y_b)\sigma$.

It is interesting to notice how the "graduation" keeps
closer to the frequency than the theoretical result.
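The rule that the area from $a$ to $b$ is $(y_a - y_b)\sigma$ makes the graduated column easy to retrace. The sketch below assumes that the curve starts at a distance $\sigma$ before the adjusted mean (1.934 less 1.457 on the scale of the first column) and that the groups end at 1.5, 2.5, etc.; this reading is inferred from the figures and reproduces the first few graduated areas to within a unit or two. The function name is illustrative.

```python
import math

def exponential_group_areas(n_total, sigma, start, upper_bounds):
    """Areas under y = (N/sigma) * exp(-(x - start)/sigma) between successive bounds."""
    def ordinate(x):
        return (n_total / sigma) * math.exp(-(x - start) / sigma)

    areas = []
    lower = start
    for upper in upper_bounds:
        # Area from lower to upper is (y_lower - y_upper) * sigma.
        areas.append((ordinate(lower) - ordinate(upper)) * sigma)
        lower = upper
    return areas

adjusted_mean = 1.934
sigma = 1.457
start = adjusted_mean - sigma               # the origin is sigma before the mean
bounds = [k + 0.5 for k in range(1, 15)]    # 1.5, 2.5, ..., 14.5
areas = exponential_group_areas(8192, sigma, start, bounds)
print([round(area) for area in areas[:6]])
```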

I give as a second example the following series based on
cricket scores known to start at the beginning of the first
group:


   Score         0-19   20-   40-   60-   80-   100-   120-   140-   160-

   Series          64    34    18     9     6      3      3      0      0
   Graduation      64    34    18    10     6      3      1      1      1


The ratio of each term to the preceding is .54, and the
graduation is almost exact. Owing, however, to the 3 at the group
120, the moments give a criterion considerably removed from
the theoretical $\beta_1 = 4$, $\beta_2 = 9$.

If we had assumed the start of the curve in the previous
example, we should have reproduced the theoretical result
almost exactly.

* See, however, general discussion in Appendix I.




Type XI 

$$y = y_0\,x^{-m}$$

Range from $x = b$ where $y = y_0 b^{-m}$ to $x = \infty$ where $y = 0$.

$m$ is found from

$$m^{3}(4-\beta_1) + m^{2}(9\beta_1-12) - 24\beta_1 m + 16\beta_1 = 0$$

$$b = \pm\,\sigma(m-2)\sqrt{\frac{m-3}{m-1}}$$

$$y_0 = N\,b^{\,m-1}(m-1)$$

The distance of mean from origin is $b(m-1)/(m-2)$.

If we use the form with origin at mean (see Table VI, facing
p. 51), $A = b(m-1)/(m-2)$ and

$$y_e = \frac{N(m-2)^{m}}{b\,(m-1)^{m-1}}.$$

As in Type VIII, $m$ can be found by simplifying the cubic
into a quadratic or by the other method indicated.

$m$ may have any value from 5 to $\infty$, but in practice its value
is not less than 9.

The curve is a special case of Type VI when $q_2 = 0$.
The criteria can be expressed as (1) special case of Type VI,
(2) $\lambda = 0$, (3) $2\beta_2 - 3\beta_1 - 6$ is positive.


Example 


   Duration     Withdrawals     Graduation
                                   by XI

       0            165            183
       1             65             53.9
       2             23             32.6
       3             32             20.0
       4             13             12.4
       5              8              7.6
       6              1              4.9
       7              6              3.1
       8              3              1.9
       9              3              1.2
      10              1               .8
      11              3              1.6

                    323            323




I have not come across a distribution really represented by
this type, but I give an unsuccessful attempt to apply it to a
series of withdrawals. The constants were

   $\beta_1 = 4.97$      $b = 57.14$
   $m = 29.69$

Distance of mean from origin = 59.205.

In calculating areas we use $y_x\,x/(m-1)$ as the area from
$x$ to $\infty$.

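Twisted J-shaped Curve

As pointed out in the notes on Type I (p. 59), we obtain an
interesting curve when both $m_1$ and $m_2$ are numerically less
than unity and one of them is negative. It arises when

$$\beta_2 > 1.5 + 1.125\beta_1$$

and when

$$\beta_2 < 2 + 1.25\beta_1$$

as can be seen by remembering that the sum of the values of the
$m$'s must lie between 1 and $-1$ or $r$ lies between 3 and 1. A
special case has been discussed as a transition type (No. XII)
when

$$y = y_0\left\{\frac{\sigma\left(\sqrt{3+\beta_1}+\sqrt{\beta_1}\right)+x}{\sigma\left(\sqrt{3+\beta_1}-\sqrt{\beta_1}\right)-x}\right\}^{\sqrt{\beta_1/(3+\beta_1)}}$$

Range from $x = \sigma\left(\sqrt{3+\beta_1}-\sqrt{\beta_1}\right)$ to
$x = -\sigma\left(\sqrt{3+\beta_1}+\sqrt{\beta_1}\right)$.

The origin is at the mean.

$$y_0 = \frac{N}{b\,\Gamma(m+1)\,\Gamma(1-m)}$$

where $m = \sqrt{\beta_1/(3+\beta_1)}$ and $b = 2\sigma\sqrt{3+\beta_1}$.

When $\mu_3$ is positive, the negative sign is taken for the square
roots.

The limit of the curve when $\beta_1 = 0$ is a horizontal line.
The criterion is $5\beta_2 - 6\beta_1 - 9 = 0$.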



Example 


   Frequency     Graduation     Ordinates

                      2            18.5
       33            31            31.6
                                   40.6
       53            49            49.3
                                   57.0
       65            65            65.2
                                   73.4
       81            83            82.4
                                   92.3
      101           103           103.3
                                  114.6
      131           134           132.9
                                  155.5
      186           191           186.0
                                  244.2
      350           342           405.4

    1,000         1,000


The mean is .051 after the centre of the 131 group. The
constants are

   $\mu_2 = 4.266$        $\beta_1 = .761$
   $\mu_3 = -7.688$       $\beta_2 = 2.646$
   $\mu_4 = 48.154$       $5\beta_2 - 6\beta_1 - 9 = -.368$

$$y = 87.2\,\{(5.8084 + x)/(2.204 - x)\}^{.45}$$
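The constants of this equation follow directly from $\mu_2$ and $\beta_1$ by the Type XII formulae, as the sketch below illustrates. It uses the numerical values of the square roots (the sign convention being settled by $\mu_3$) and Python's `math.gamma` for the $\Gamma$-functions; the function name and the labels "long arm" and "short arm" for the two distances from the mean to the ends of the range are illustrative only.

```python
import math

def type_xii_constants(n_total, mu2, beta1):
    """Type XII (twisted J) constants with origin at the mean."""
    sigma = math.sqrt(mu2)
    m = math.sqrt(beta1 / (3 + beta1))
    b = 2 * sigma * math.sqrt(3 + beta1)          # total range
    y0 = n_total / (b * math.gamma(1 + m) * math.gamma(1 - m))
    long_arm = sigma * (math.sqrt(3 + beta1) + math.sqrt(beta1))
    short_arm = sigma * (math.sqrt(3 + beta1) - math.sqrt(beta1))
    return m, b, y0, long_arm, short_arm

m, b, y0, long_arm, short_arm = type_xii_constants(1000, 4.266, 0.761)
print(f"m = {m:.3f}, y0 = {y0:.1f}")
print(f"range: {long_arm:.4f} on one side of the mean, {short_arm:.4f} on the other")
```

The two arms add up to $b$, and the infinite ordinate lies at the end nearer the mean.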

In addition to the graduation a number of equidistant
ordinates is given. They show that the curve rises abruptly,
then less abruptly and then again more abruptly. The
withdrawals in select tables are sometimes of this shape (e.g.
Japanese experience, 1910, age 62, females). A somewhat
similar twist occurs in a population curve.


U-shaped Curve

This shape arises in Type I when $m_1$ and $m_2$ (or $m$ in Type II)
are negative. There are difficulties in fitting it to statistics
because it is awkward to adjust the rough moments. The
figures given in the table of examples (p. 101, col. 6) were
found by calculating the areas of the curve.

The limit of the U-curve is two separate blocks of frequency
at the ends of the range. This limit is reached when

$$\beta_2 - \beta_1 - 1 = 0.$$

Some of the curves with which we have dealt are rare and in
practical curve-fitting may be avoided, for they depend on
certain definite values of $\beta_1$ and $\beta_2$, and the chance of reaching
these exact values is negligible. In other words, if the object of
fitting a curve in any particular case is to obtain the closest
agreement between the actual figures given and the graduated
figures, then the main Types (I, IV and VI) are all that are
necessary, for the other types being transition types and
depending on specific values of $\beta_1$ and $\beta_2$ need not arise. If,
however, our object is to study probability in a wider sense, the
transition types are of importance and they may, of course, be
properly used when the values of the $\beta$'s only differ from those
indicated by the criteria to a small extent. This "small
extent" means (as we shall see later) within the limit suggested
by the standard errors of the $\beta$'s.

ADDITIONAL EXAMPLES 

1. Up to the present we have merely considered examples
with a view to illustrating the various types of frequency-curves,
but it seems advisable to consider one or two practical
examples which may help to show the range of applicability
of the curves in actuarial work, and give an opportunity of
noticing a few difficulties which may arise in applying them.

The function with which actuaries generally wish to deal in
practical work is not an exposed to risk or series of deaths or
withdrawals, but the ratio between the deaths and the
exposed, that is, with the rates of mortality, sickness, marriage,
and withdrawal. An actuary studying frequency-curves may
therefore naturally ask whether any of these rates can be
graduated by means of the curves we have examined, and, if
they fail, must they be put aside for some other method? Now
the first point to be considered is whether these rates are
frequency distributions; if they are not, the use of the
frequency-curve is empirical. A rate of mortality gives the
proportion of people at each age who die, and if we imagine 1,000
persons exposed to risk at each integral age, the number of
deaths would be 1,000 times the rate of mortality, and this
seems to show that it is possible to consider the rate of
mortality as a distribution, though it is one that could hardly
arise in actual experience. It is impossible to describe the
rates of mortality or sickness by a single frequency-curve. On
the other hand, the rates of marriage are certainly much like
frequency-curves, and the rates of withdrawal, whether
regarded according to age or duration, might take a form like our
example in Type III. There are, however, practical objections
to the direct operation on rates, even apart from the very
exaggerated idea of frequency distributions in which it is
necessary to indulge. The numbers exposed to risk at the end
of any table become small, and a single death or marriage there
gives a very large rate, while at several ages near there may be
a zero rate shown by the ungraduated data. This is extremely
awkward, as it tends to make the ratios dealt with far rougher
in application than the actual observations are in fact, and we
are forced to group the material before using it, which
introduces an arbitrary practice which it is well to avoid so far as
possible. It must not, of course, be inferred that a small
number of say fifty or one hundred deaths must necessarily be
grouped according to each year of age, but that even if there
are two or three thousand the roughnesses introduced by the
use of rates influence the result considerably. Graduating rates
means that an equal weight is given to each rate of mortality
which is far from the weight indicated by the exposed to risk.

2. It will be useful to consider a case bearing out these
objections and then deal with a practical method of overcoming
them. The statistics to be considered have been taken from a
paper by M. Mackenzie Lees "On Rates of Mortality and
Marriage among daughters of Peers and Heirs Apparent, etc."
(Trans. Fac. Actu. i, 276), and may be summarised as on p. 117.

The moments were calculated by the Summation Method,
and were found, about the mean 28.77191, to be

   $\mu_2 = 63.2092$        $\beta_1 = 1.557153$
   $\mu_3 = 627.101$        $\beta_2 = 4.781321$
   $\mu_4 = 19,103.3$

The criterion was $\kappa = -1.5$, but as I had neglected the rate
.0089 at 71 in calculating the moments, I used Type III. The
inclusion of the rate at that age would have lengthened the
curve and considerably increased the arithmetical value of the
criterion. Moreover $2\beta_2$ approximates to $6 + 3\beta_1$.

The constants for Type III were

   $\gamma = .201592$       $a = 7.78189$
   $p = 1.56881$        Mode = 23.81128

The curve starts, therefore, at age 16.02939.

   $y_0 = 890.05$.

The rates resulting from this graduation are given in the
table, and while they tend to show that the distribution of rates
of marriage is closely allied to a frequency-curve, they do not
give a satisfactory graduation, and the failure is due almost
entirely to the objections mentioned above. If we were
examining the algebraic form taken by rates of marriage, we should
begin by work on population data where the roughness of
material is avoided by the large numbers of individuals dealt
with; as, however, we are seeking for a graduation, we must see
how these objections, which of course apply to some extent to
any method of graduation, can be overcome. It has been
remarked that the cause of the difficulty is that incorrect
weights are given to the items used, and the most obvious
suggestion is that the actual exposed and marriages should be
graduated separately. This, however, entails a large amount of
additional work and seems to overlook the fact that deviations
in the exposed to risk and the marriages are not independent.
A shorter method can be used which avoids both the double
graduation and the error just indicated. This method consists
of using a series allied to the exposed, and treating it as a
hypothetical exposed to risk from which a new series of
marriages can be calculated. The advantages are that we have
only to make one graduation, and the weights of the various
parts of the table are given approximately. In a similar way
$q_x$ can be graduated, and in this connection it may be remarked
that as the exposed to risk is generally capable of being
represented by a frequency-curve, it is natural to suggest that the
hypothetical exposed might be taken as the simplest form
assumed by such curves (normal curve); this is also convenient
because the ordinates for such curves have been tabulated.

3. The hypothetical exposed can be fixed by trial or from the
values of the exposed. The column $E'$ in the table given on p.
117 is taken from Sheppard's Tables in Tables for Statisticians,
$x$ being taken as 3.06, 3.084, 3.108, 3.132, etc., and the entries
were multiplied by a power of 10. $M'$, the product of $E'$ and the
rate of marriage, was then formed and
graduated. The following values were obtained for the $M'$
series:

   Mean = 24.85779       $\beta_1 = 1.40775$
   $\mu_2 = 29.5006$         $\beta_2 = 5.01114$
   $\mu_3 = 190.112$         $\kappa = -7.102$
   $\mu_4 = 4,361.12$

As $\kappa$ is large, Type III was used, and

   $\gamma = .310350$        $y_0 = 192.625$
   $p = 1.841405$        Mode = 21.63562
   $a = 5.933325$

The curve was then worked out and the rates of marriage
in the final column were obtained by dividing $M'$ by $E'$. They
agree closely with the ungraduated figures.




4. A numerical example of the application of the method to
the Table may now be given. The normal curve with
$\sigma = 10$ and origin at age 52½ was used, and the values were
multiplied by $q_x$ with the help of Crelle's Tables.

A part of the work was


   $q_x \times E'_x$      Age     Ordinate from        Age     $q_x \times E'_x$
   (scaled)              Sheppard's Tables            (scaled)

      810         52        .3984439           53        801
      597         51        .3944793           54        850
      644         50        .3866681           55        875
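The ordinates in this extract are those of the normal curve, and the grouping "in fives" is a simple sum. The sketch below assumes the origin to be at age 52½ (the ordinates being symmetrical about that age) and takes the groups as five consecutive ages beginning at 50, 55, etc.; the five products are those printed above, and the variable names are illustrative.

```python
import math

def normal_ordinate(z):
    """Ordinate of the normal curve in terms of the standard deviation."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

origin_age, sigma = 52.5, 10.0      # assumed origin; sigma as stated in the text
hypothetical_exposed = {age: normal_ordinate((age - origin_age) / sigma)
                        for age in range(50, 56)}
for age, ordinate in hypothetical_exposed.items():
    print(age, f"{ordinate:.7f}")

# Products (rate x hypothetical exposed, scaled) for ages 50 to 54 as printed above.
products = {50: 644, 51: 597, 52: 810, 53: 801, 54: 850}
group_total = sum(products.values())
print("group beginning at age 50:", group_total)
```

The total for the group beginning at age 50 agrees with the entry against 50 in the grouped series that follows.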


Summing these entries ($q_x \times E'_x$, scaled) in fives, I formed the
following:

     Age      $q_x \times E'_x$
              (scaled)

      20          13
      25          70
      30         218
      35         594
      40       1,394
      45       2,460
      50       3,702
      55       4,519
      60       4,385
      65       3,602
      70       2,249
      75       1,197
      80         461
      85         133
      90          31
      95           5
     100           1

              25,034


The abbreviations (use of Crelle's Tables and grouping)
were adopted to save labour, and as the figures were required
for an example they are sufficiently accurate.
The following values were then found:

   Mean age = 59.439762
   $\mu_2 = 4.584327$
   $\mu_3 = -.4999871$
   $\mu_4 = 61.17014$

   Type of curve: No. I.
   $m_1 = 32.81166$
   $m_2 = 26.57123$
   $a_1 = 18.78553$
   $a_2 = 15.21272$
   $y_0 = 4609.884$
   Mode age = 59.730789

(The unit is 5 years of age.)

The ordinates were then calculated for every fifth age, and
finding that the curve is not very far removed from the normal
curve of error, I interpolated in the second differences of the
logarithms of the ordinates for those at the other ages.* A
quadrature formula was used for finding areas, and $q_x$ was
found by dividing by the hypothetical figure already used for
the exposed.

The expected deaths were as follows:


   Group     Graduated $q_x$      Actual     Expected        Deviation
             for central age                                +         -
             of group

   15-19                                        1.5        1.5
   20-          .00643              9           8.9                   .1
   25-          .00731             69          61.0                  8.0
   30-          .00850            205         204.6                   .4
   35-          .00991            369         380.7       11.7
   40-          .01179            588         575.6                 12.4
   45-          .01452            801         811.4       10.4
   50-          .01866          1,064       1,063.8                   .2
   55-          .02505          1,399       1,386.6                 12.4
   60-          .03516          1,752       1,773.2       21.2
   65-          .05118          2,164       2,136.7                 27.3
   70-          .07682          2,216       2,261.2       45.2
   75-          .11648          1,965       1,925.8                 39.2
   80-          .17462          1,237       1,241.9        4.9
   85-          .24870            494         514.4       20.4
   90-          .33286            129         126.0                  3.0
   95-          .43289             18          17.3                   .7
   100-                             1           1.5         .5

                               14,480      14,492.1      115.8     103.7

                                                              219.5


* As $y = y_0 e^{-x^{2}/2\sigma^{2}}$ is the equation to normal curve, the logarithm is
$Ax^{2} + Bx + C$, say. The criterion shows if the curve is nearly normal.




5. It will be interesting to examine a particular case of the
method just described, as it is often required by actuaries.
Defining Makeham's hypothesis as colog $p_x = A + Bc^{x}$, we
take a normal curve $y_0 e^{-(x-h)^{2}/2\sigma^{2}}$ to represent the exposed and
multiply by the values of colog $p_x$. This means that we assume
that the products can be represented by

$$y = A y_0 e^{-(x-h)^{2}/2\sigma^{2}} + H B y_0\, e^{-\{x^{2} - 2[h+\sigma^{2}\log_e c]\,x + [h+\sigma^{2}\log_e c]^{2}\}/2\sigma^{2}}$$

where $H = e^{\{h^{2} + 2\sigma^{2}h\log_e c + \sigma^{4}(\log_e c)^{2} - h^{2}\}/2\sigma^{2}} = e^{h\log_e c + \frac12\sigma^{2}(\log_e c)^{2}}$, or, writing $t$ for $h + \sigma^{2}\log_e c$,

$$y = A y_0 e^{-(x-h)^{2}/2\sigma^{2}} + H B y_0 e^{-(x-t)^{2}/2\sigma^{2}} \qquad \text{(I)}$$

i.e. the sum of two normal curves both having the same
standard deviation as the exposed curve and one having the
same origin.

The difference between the two means gives $\sigma^{2}\log_e c$, so
$\log_{10} c = \dfrac{t-h}{\sigma^{2}}\log_{10} e$.

The whole solution is made very simple by taking moments
about the known origin (age $h$), for
$\int_{-\infty}^{+\infty} x\,y\,dx$ and $\int_{-\infty}^{+\infty} x^{2}y\,dx$
(the first two moments) give

$$(t-h)N_2\,^{*} \qquad\text{and}\qquad N_1\sigma^{2} + N_2\{\sigma^{2} + (t-h)^{2}\}\,^{\dagger}$$

where $N_1 = A y_0\sigma\sqrt{2\pi}$ and $N_2 = H B y_0\sigma\sqrt{2\pi}$.

Dividing the values just given by $N_1 + N_2$ (the total
frequency), we obtain, as the first moment about the known
origin,

$$\mu_1' = \frac{(t-h)N_2}{N_1+N_2},$$

and, as the second,

$$\mu_2' = \frac{N_1\sigma^{2} + N_2\sigma^{2} + N_2(t-h)^{2}}{N_1+N_2} = \sigma^{2} + (t-h)\mu_1'$$

* Remember that the normal curve is symmetrical, so that the odd moments
about the mean of such a curve are zero.
† Can be seen at once as the sum of two integrals; $N_1\sigma^{2}$ gives the second
moment of the first normal curve in (I), and $N_2\{\sigma^{2} + (t-h)^{2}\}$ gives the second
moment of the second normal curve.



or 


t — h = 




and 


ly _ 1 + -^2) 

^ t-h 


where fi' is written for moments about h 
As stated above 


and if Uq = 


and 


IQfe 

cr^j{2n] 


, t-h 

logiflC = -^ logic e 


as IS generally convenient, then 


(Ih) 


A = iVi/10*= 

B = N^I{10^xH) 
N, 1 


10 * 


gMogjC+y (logtC)* 


A, 


(t-h) . 


(see equation (II)) 


N, 


lO^c 2 


Care is necessary with regard to the value used for $y_0$, and consequently with regard to A and B. If Sheppard's tables in Tables for Statisticians of ordinates (z) be multiplied by, say, $10^k$ and used as the exposed to risk, the values of A and B resulting from the work will be $N_1/(10^k\sigma)$ and $N_2/(10^k H\sigma)$.

The reason is that his tables are in terms of standard deviation.
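
The step from the product of a normal curve and $A + Bc^x$ to the sum of two normal curves in (I) can be checked numerically. The sketch below is only an illustration: the values given to A, B, c, $y_0$, h and σ are arbitrary assumptions chosen for the check, and the function names are not taken from the text.

```python
import math

def product_form(x, A, B, c, y0, h, sigma):
    """Exposed (a normal curve) multiplied by colog p_x = A + B*c**x."""
    normal = y0 * math.exp(-(x - h) ** 2 / (2 * sigma ** 2))
    return normal * (A + B * c ** x)

def two_normal_form(x, A, B, c, y0, h, sigma):
    """The same product written as the sum of two normal curves, as in (I)."""
    log_c = math.log(c)
    t = h + sigma ** 2 * log_c                       # mean of the second curve
    H = math.exp(h * log_c + 0.5 * sigma ** 2 * log_c ** 2)
    first = A * y0 * math.exp(-(x - h) ** 2 / (2 * sigma ** 2))
    second = H * B * y0 * math.exp(-(x - t) ** 2 / (2 * sigma ** 2))
    return first + second

# Illustrative (assumed) constants, not figures from the text.
PARAMS = dict(A=0.003, B=0.00005, c=1.095, y0=1000.0, h=55.5, sigma=10.0)

for age in (20, 40, 55.5, 70, 90):
    print(age, product_form(age, **PARAMS), two_normal_form(age, **PARAMS))
```

The two columns printed should agree to rounding error at every age, since the second form is merely the first with the square completed in the exponent.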
6. If we assume, as Hardy did when graduating the British Offices 1863-1893 experience, that $\log_{10} c$ is known, we only require to calculate one moment, which gives us $\dfrac{(t-h)N_2}{N_1+N_2}$, and this, with the help of equation (II), enables us to complete the solution. If c were obtained for the aggregate table, we should use this result for the select tables.




7. A numerical example with the […] Table may be of interest. A normal curve with standard deviation 10 and origin 55½ was taken, and the terms multiplied by $\operatorname{colog} p_x$. These were then grouped in fives, and the first two moments calculated about age 55½. One little point should be borne in mind in connection with the grouping: though the centre of the base on which the product ($q_x$ × exposed) stands is $x + \frac{1}{2}$, the result ($\operatorname{colog} p_x$ × exposed) is an ordinate at x; the centre point of five ages 20 to 24 is 22½ when $q_x$ is used and 22 when $\operatorname{colog} p_x$ is used.

The figures were

$N_1 + N_2$ = 136387
1st moment about 55½ in 5-years unit = 1.416184
2nd   „      „      „      „        = 4.1929354

Deducting Sheppard's adjustment of 1/12 from the second moment* and multiplying the first moment and the adjusted second moment by 5 and 25 respectively to make the unit one year instead of 5 years, we have

$\mu_1'$ = 7.080920
$\mu_2'$ = 164.384085
then $\log(t-h)$ = .9586889
$t - h$ = 9.092617
$\log_{10} c$ = .03948873
A = .00301749
B = .00004518782
$\log_{10} B$ = $\bar{5}$.6550214

$q_x$ was then calculated from the graduated $\operatorname{colog} p_x$ obtained from the values of A, B and c, and the following table of expected deaths was worked out. The values of […] are given in the table showing the frequency-curve graduation.

* As we are dealing with the sum of five ordinates in each group and not with an area, we should not, strictly speaking, use Sheppard's adjustment, but should deduct .08. The difference is small and the constants have not been recalculated. The formulae would be $\mu_2 = \nu_2 - .08$ and $\mu_4 = \nu_4 - .48\nu_2 + .02752$, where $\mu$ is adjusted and $\nu$ unadjusted.



Age         Graduated q_x      Expected     Actual        Deviation
group       for central age    deaths       deaths       +         −
            of group

Under 25                           13.0          9       4.0
25-29         .00812               67.0         69                 2.0
30-           .00882              211.6        205       6.6
35-           .00991              380.8        369      11.8
40-           .01162              566.9        588                21.1
45-           .01431              799.7        801                 1.3
50-           .01854            1,057.5      1,064                 6.5
55-           .02517            1,392.7      1,399                 6.3
60-           .03551            1,790.2      1,752      38.2
65-           .05160            2,153.0      2,164                11.0
70-           .07639            2,249.3      2,216      33.3
75-           .11415            1,888.7      1,965                76.3
80-           .17053            1,213.6      1,237                23.4
85-           .23352              519.1        494      25.1
90-           .36484              136.6        129       7.6
95-                                20.6         19       1.6

Total                          14,460.3     14,480     128.2     147.9
                                                            276.1



This result is very like that given by the late Sir G. F. Hardy, but avoids having to obtain c by trial. Hardy's expected and actual deaths balance better than the above, but I do not think the rates have been understated systematically; the 75-79 group accounts for the disagreement. The total deviation is less than Hardy's.
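
The arithmetic of this example can be retraced in a few lines. The sketch below takes the two moments, σ = 10 and the origin 55½ from the text; the multiplier $10^7$ for the normal curve and the reading of colog as a logarithm to base 10 are assumptions (they are not stated in the passage, but appear consistent with the printed constants and rates). The function names are illustrative only.

```python
import math

def makeham_constants(mu1, mu2, sigma, h, total, scale):
    """Solve for t - h, log10(c), A and B from two moments about the origin h.

    mu1, mu2 : first and second moments about h (unit of one year)
    total    : N1 + N2, the total of the products
    scale    : the multiplier applied to the normal curve (assumed 10**7 below)
    """
    t_minus_h = (mu2 - sigma ** 2) / mu1
    log10_c = t_minus_h / sigma ** 2 * math.log10(math.e)   # equation (II)
    n2 = mu1 * total / t_minus_h
    n1 = total - n2
    log10_H = (h + 0.5 * t_minus_h) * log10_c               # H = c**(h + (t-h)/2)
    A = n1 / scale
    B = n2 / (scale * 10 ** log10_H)
    return t_minus_h, log10_c, A, B

def q_from_colog(x, A, B, log10_c):
    """q_x from colog p_x = A + B*c**x, reading colog as a base-10 logarithm."""
    colog_p = A + B * 10 ** (x * log10_c)
    return 1.0 - 10 ** (-colog_p)

t_minus_h, log10_c, A, B = makeham_constants(
    mu1=7.080920, mu2=164.384085, sigma=10.0, h=55.5,
    total=136387.0, scale=1e7)          # scale = 10**7 is an assumption

print(t_minus_h, log10_c, A, B)
print(q_from_colog(82, A, B, log10_c))  # 82 = central age of the 80-84 group
```

With these inputs the constants agree with those printed above to the figures shown, and the rates for central ages such as 52, 72 and 82 agree with the tabulated graduated rates to within a unit or two in the last place.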

8. Another possible application of frequency-curves to life assurance and mortality statistics was discussed recently. The exposed to risk or the amount of the sums assured or premiums at each age can usually be graduated by a frequency-curve. When an actuary values the liabilities of an insurance company he works, in effect, on the proportion of the business that survives to each age in successive years according to the mortality table assumed in the valuation. If the proportion, at age x, that survives n years by a given table of mortality is ${}_np_x$ and if $E_x$ is the amount of sums assured, say, on the books at age x, then the amount of sums assured surviving after n years is $E_x\,{}_np_x$. For diverse mortality tables, various values of n and a fairly wide range of frequency-curves assumed for $E_x$, we again reach a frequency-curve as an approximation to the distribution of $E_x\,{}_np_x$ in terms of x. Several statistical examples have been given* and the reader who wishes to examine other examples than those given in this book may refer to them or to such a large collection of examples as those given by K. Pearson and A. Lee for Barometric Heights.†

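9. If we know the range of a curve we need not even with Type I find as many as four moments, for the equations on p. 64, giving the moments about the start of the curve, afford a simple solution. We have

$$\mu_1' = \frac{b(m_1+1)}{m_1+m_2+2} \qquad \text{and} \qquad \mu_2' = \frac{b^2(m_1+1)(m_1+2)}{(m_1+m_2+2)(m_1+m_2+3)}$$

and writing

$$\gamma_1 = \frac{\mu_1'}{b} \qquad \text{and} \qquad \gamma_2 = \frac{\mu_2'}{\mu_1' b}$$

we have

$$m_1 + 1 = \frac{\gamma_1(\gamma_2-1)}{\gamma_1-\gamma_2}$$

and

$$m_2 + 1 = \frac{(\gamma_2-1)(1-\gamma_1)}{\gamma_1-\gamma_2}$$

where $\mu'$ is written for a moment about the start of the curve.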

10. If, however, we can only fix by general considerations 
the start of the curve, the following solution depending on 
three moments is of use. 


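Writing

$$\lambda_2 = \frac{\mu_2'}{\mu_1'^2} \qquad \text{and} \qquad \lambda_3 = \frac{\mu_3'}{\mu_1'\mu_2'}$$

the values of the constants in the equation to the curve are given by

$$m_1 + 1 = \frac{2(\lambda_2-\lambda_3)}{2\lambda_3-\lambda_2-\lambda_2\lambda_3}$$
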


* J. Inst. Actu. LXV, 1.
† Philos. Trans. A, CXC, 423.



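and

$$m_2 + 1 = \frac{2(\lambda_2-\lambda_3)(\lambda_3-1)(1-\lambda_2)}{(2\lambda_3-\lambda_2-\lambda_2\lambda_3)(1+\lambda_3-2\lambda_2)}$$

$$b = \mu_1'\,\frac{m_1+m_2+2}{m_1+1}$$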
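
The three-moment solution can be tried by a round trip: moments are computed from assumed constants and the constants are then recovered from the moments. In the sketch below the ratios are taken as $\lambda_2 = \mu_2'/\mu_1'^2$ and $\lambda_3 = \mu_3'/(\mu_1'\mu_2')$, which is an assumption about the notation; the function names and the trial values are illustrative.

```python
def moments_about_start(m1, m2, b):
    """First three moments of y = y0 * x**m1 * (b - x)**m2 about x = 0."""
    p, q = m1 + 1, m2 + 1
    s = p + q
    mu1 = b * p / s
    mu2 = b ** 2 * p * (p + 1) / (s * (s + 1))
    mu3 = b ** 3 * p * (p + 1) * (p + 2) / (s * (s + 1) * (s + 2))
    return mu1, mu2, mu3

def type_one_from_three_moments(mu1, mu2, mu3):
    """Recover m1, m2 and the range b when only the start of the curve is fixed."""
    lam2 = mu2 / mu1 ** 2
    lam3 = mu3 / (mu1 * mu2)
    d = 2 * lam3 - lam2 - lam2 * lam3
    e = 1 + lam3 - 2 * lam2
    p = 2 * (lam2 - lam3) / d                                    # m1 + 1
    q = 2 * (lam2 - lam3) * (lam3 - 1) * (1 - lam2) / (d * e)    # m2 + 1
    b = mu1 * (p + q) / p
    return p - 1, q - 1, b

# Trial constants (assumed for the check only).
trial = (0.35, 2.78, 15.5)
recovered = type_one_from_three_moments(*moments_about_start(*trial))
print(trial, recovered)
```

The recovered constants should reproduce the trial constants to rounding error; with real data the third moment is, of course, subject to larger sampling variation than the first two.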

11. We may return to Type I for an example of the method of § 9, where we will assume that the curve starts at age 17.5 and has a range of 15.5 units. Considering the line for age 22 in the table on p. 60 we see that 4.175 and 14.634 give $S_2$ and $S_3$, excluding the first group, and the moments about age 17 are then found to be 4.175 and 25.093; transferring to 17.5, we have 4.075 and 24.268; adding the moments for the first group, $.034 \times \frac{1}{5}$ and $.034 \times (\frac{1}{5})^2$ respectively, $\mu_1'$ = 4.0818 and

$\mu_2'$ = 24.26936

Hence $m_1$ = .3498, $a_1$ = 1.735
$m_2$ = 2.7758, $a_2$ = 13.765
$y_0$ = 154.2

and the mode is 17.5 + 1.735 × 5 = 26.175.

From these values the graduated figures for the first four groups are 37, 140, 152, 143.
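
The figures of this example follow directly from the formulae of § 9, as the sketch below shows. The relation $a_1 = b\,m_1/(m_1+m_2)$ for the distance from the start of the curve to the mode is the ordinary condition for the maximum of $x^{m_1}(b-x)^{m_2}$; the function name is illustrative.

```python
def type_one_known_range(mu1, mu2, b):
    """m1 and m2 from two moments about the start of the curve and the range b."""
    gamma1 = mu1 / b
    gamma2 = mu2 / (mu1 * b)
    m1 = gamma1 * (gamma2 - 1) / (gamma1 - gamma2) - 1
    m2 = (gamma2 - 1) * (1 - gamma1) / (gamma1 - gamma2) - 1
    return m1, m2

b = 15.5                                   # range, in units of 5 years
m1, m2 = type_one_known_range(4.0818, 24.26936, b)
a1 = b * m1 / (m1 + m2)                    # start of curve to mode
a2 = b - a1
mode_age = 17.5 + 5 * a1                   # unit of grouping is 5 years

print(m1, m2, a1, a2, mode_age)
```

The values found agree with those printed (.3498, 2.7758, 1.735, 13.765 and 26.175) to within a unit or two in the last figure.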

12. It may be of help to give another example of a J-shaped curve and we take the first example of Table I for which the mean is at duration 4.182, and the moments and constants are

$\mu_2$ = 17.63688        $\beta_1$ = 3.34846
$\mu_3$ = 135.5361        $\beta_2$ = 6.18392
$\mu_4$ = 1923.565        $\kappa$ = −1.307

so that the curve will be of Type I and equation to it is

$$y = .89082\,x^{-.629685}(25.49729 - x)^{1.624275}$$

the origin being at 1.02897 where the curve starts.
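
The constants quoted can be checked from the three moments. The sketch below uses the usual definitions $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$, and takes the criterion in its customary form $\kappa = \beta_1(\beta_2+3)^2 / \{4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)\}$; that formula is not restated in this passage and is assumed here.

```python
def pearson_constants(mu2, mu3, mu4):
    """beta1, beta2 and the criterion kappa from moments about the mean."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    kappa = beta1 * (beta2 + 3) ** 2 / (
        4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6))
    return beta1, beta2, kappa

beta1, beta2, kappa = pearson_constants(17.63688, 135.5361, 1923.565)
print(beta1, beta2, kappa)
```

A negative value of the criterion points to Type I, as stated.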




The graduation by this curve is shown in the following table

Duration     Withdrawals     Graduated by
                             Type I curve

    1            308             312
    2            200             198
    3            118             101
    4             69              76
    5             59              58
    6             44              45
    7             29              37
    8             28              30
    9             26              25
   10             21              21
   11             18              18
   12             18              15
   13             12              13
   14             11              11
   15              5               9
   16             11               7
   17              7               6
   18              6               5
   19              1               4
   20              3               3
   21              1               2
   22              3               2
   23              2               1
   24                              1

 Total         1,000           1,000



13. The calculation of the graduated area of the first group may present a difficulty, as a quadrature formula cannot be applied, and the following method gives the best way of obtaining a correct value

$$\int_0^x y_0 x^{m_1}(b-x)^{m_2}\,dx = y_0 b^{m_2} x^{m_1+1}\left(\frac{1}{m_1+1} - \frac{m_2 x}{b(m_1+2)} + \dots\right)$$

which is a rapidly convergent series when x is small. In the preceding example, where x is 1.5 − 1.02897 = .47103, the second term barely affects the result. $y_0$ must be calculated by the formula

$$y_0 = \frac{N}{b^{m_1+m_2+1}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}.$$

The expression for finding the area of the first group in Type III curve is

$$y_0'\,x^{p+1}\left(\frac{1}{p+1} - \frac{\gamma x}{p+2} + \frac{\gamma^2 x^2}{2!\,(p+3)} - \dots\right)$$

where $y_0' = N\gamma^{p+1}/\Gamma(p+1)$.
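
The series for the first group of a Type I curve is easily programmed. The sketch below is a minimal illustration under the assumption that the total frequency of the preceding example is N = 1,000; it computes $y_0$ from the gamma-function formula and sums the binomial expansion to any number of terms. The function names are illustrative and no printed group value is asserted.

```python
import math

def type_one_y0(N, m1, m2, b):
    """y0 for y = y0 * x**m1 * (b - x)**m2 with total frequency N."""
    return N * math.gamma(m1 + m2 + 2) / (
        b ** (m1 + m2 + 1) * math.gamma(m1 + 1) * math.gamma(m2 + 1))

def first_group_area(x, y0, m1, m2, b, terms=2):
    """Area from the start of the curve to x, by expanding (1 - u/b)**m2."""
    bracket = 0.0
    coeff = 1.0                        # binomial coefficient times (-1/b)**r
    for r in range(terms):
        bracket += coeff * x ** r / (m1 + 1 + r)
        coeff *= (m2 - r) / (r + 1) * (-1.0 / b)
    return y0 * b ** m2 * x ** (m1 + 1) * bracket

m1, m2, b = -0.629685, 1.624275, 25.49729
y0 = type_one_y0(1000.0, m1, m2, b)    # N = 1,000 is an assumption
x = 1.5 - 1.02897

one_term = first_group_area(x, y0, m1, m2, b, terms=1)
two_terms = first_group_area(x, y0, m1, m2, b, terms=2)
many_terms = first_group_area(x, y0, m1, m2, b, terms=12)
print(y0, one_term, two_terms, many_terms)
```

The value of $y_0$ so found agrees with the .89082 of the equation, and the second term alters the first by less than one per cent, which bears out the remark on convergence.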



CHAPTER VI


COMPARISON OF VARIOUS SYSTEMS
OF CURVES

1. In the previous chapter we dealt with Pearson's system of frequency-curves, but other methods have been used to describe frequency distributions. We have already seen that Pearson's system of curves describes the facts that have been collected about a variety of subjects connected with chance. A system is useless if it does not give approximately the distributions that actually occur. The binomial series is justified from this point of view as a description of the number of times events happen, because we have found from experience that the numbers given by it are realised approximately by trial. When we consider the matter we are almost compelled to admit that the real justification of any theory of probability is that events happen in the way such a theory leads one to expect, and if we wish to compare the systems of frequency-curves that have been suggested in recent years, it should be done not so much by examining the ways in which they have been derived as by seeing what classes of distribution they represent and by noticing carefully the cases of failure and the difficulties of application.

2. As we know from experience that the binomial series actually represents a simple type of probability, it is natural to start from it and treat it, or its limits, as a part of any system; it must, in fact, be a special case of any more general type that may be evolved.

We can proceed either by building up a curve on assumptions which it seems natural to adopt or by taking a more complex series than the binomial (e.g. the hypergeometrical) and in either case an expression might be reached having greater generality than the binomial. But it must be remembered that the ultimate justification of any evolved formula rests mainly on its breadth of application to statistics which may reasonably be described as chance distributions. Such application is an important test of the fundamental assumptions that were adopted in reaching the formula, for it must be admitted that the plausibility of the initial statements would be poor defence of a curve which broke down whenever it was put to a practical test.

The well-known "normal curve of error", with which we dealt on p. 80, was a first step towards finding a simple frequency-curve, but though it works well as a description of the binomial $(p + q)^n$ when p is approximately equal to q or when n is large, it is unsatisfactory in other cases. In actuarial work these cases frequently arise. At the ages attained by the majority of lives assured in any assurance office the rate of mortality or probability of a person dying in a year is small and the frequency distribution giving the number of deaths happening in successive years out of 50 cases, say, when q = .02 and p = .98, would not be satisfactorily described by the normal curve of error. It is true in a sense that the "normal curve" is a law of great numbers, but if it can only deal with cases resting on such a basis it cannot have a large sphere of action in practical statistics and it can hardly be expected to be of value when a series is more like the hypergeometrical than the binomial.
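
The case quoted, 50 lives with q = .02, can be set out numerically. The sketch below is an illustration only: it compares the binomial terms with the ordinates of a normal curve having the same mean and standard deviation, which is one simple way (assumed here) of making the comparison.

```python
import math

def binomial_term(k, n, q):
    """Chance of exactly k deaths among n lives, each with rate q."""
    return math.comb(n, k) * q ** k * (1 - q) ** (n - k)

def normal_ordinate(x, mean, sd):
    """Ordinate of the normal curve of error with the given mean and s.d."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

n, q = 50, 0.02
mean = n * q
sd = math.sqrt(n * q * (1 - q))

for deaths in range(-1, 6):
    exact = binomial_term(deaths, n, q) if deaths >= 0 else 0.0
    print(deaths, round(exact, 4), round(normal_ordinate(deaths, mean, sd), 4))
```

The normal curve gives an appreciable frequency to the impossible case of −1 deaths and falls well short of the binomial term for no deaths, which illustrates the unsatisfactory agreement referred to.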

3. It is this failure of the "normal curve" that has led to the work of Pearson, Thiele, Charlier, Edgeworth, Bruns, Kapteyn and others, and the curves suggested by these writers are of considerable interest to all students of statistical mathematics. In this chapter we shall indicate how far some of these curves fit the statistics that arise in practice, how far, in fact, they graduate the rough figures obtained from the collected facts, and where they break down.

Before proceeding, however, it will be necessary to discuss briefly the suggested types. We may also mention an old difficulty in practical work of this nature, namely, that statistics are seldom obtained from strictly homogeneous material. This fact must be taken as one of the typical elements in practice, and if a series can graduate in spite of a small amount of heterogeneity it is, from some points of view, all the more valuable in much of the work that comes to the hands of an actuary or statistician.

4. We may now turn to an expression which we will call Type A, namely

$$F(x) = \phi_0(x) - \frac{(\mu_3)}{3!}\phi_3(x) + \frac{(\mu_4 - 3\mu_2^2)}{4!}\phi_4(x) - \frac{(\mu_5 - 10\mu_2\mu_3)}{5!}\phi_5(x) + \dots$$

where $\phi_0(x) = \dfrac{1}{\sigma\sqrt{2\pi}}e^{-x^2/2\sigma^2}$ and $\phi_n(x)$ is its nth differential coefficient.

So that, if $\sigma = 1$, i.e. if we measure in terms of the standard deviation,

$$\phi_3(x) = (3x - x^3)\phi_0(x)$$
$$\phi_4(x) = (x^4 - 6x^2 + 3)\phi_0(x)$$
$$\phi_5(x) = (-x^5 + 10x^3 - 15x)\phi_0(x)$$

In applying these expressions x is measured throughout from the mean in terms of the standard deviation, the measures used in Tables for Statisticians. It may be mentioned that the coefficients in round brackets in the equation for Type A as set out above are the third, fourth and fifth semi-invariants. In Tables for Statisticians (Pt II, Tables V-VII)

$$\tau_n(h) = \frac{(-1)^{n-1}}{\sqrt{n!}}\,\frac{d^{n-1}}{dh^{n-1}}\left(\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}h^2}\right)$$

and when using these tables we write F as

$$\frac{1}{\sigma}\{\tau_1(h) + .81649658\sqrt{\beta_1}\,\tau_4(h) + .45643546(\beta_2 - 3)\tau_5(h) + \dots\}$$

This series has been discussed by many writers, especially on the Continent*, and it may be regarded as the use of the "normal curve" as a generating function. It has, naturally, a greater range of applicability than the "normal curve", but it is not of service in the more extremely skew cases, and it has been suggested by C. V. L. Charlier that, in such circumstances, an expression Type B should be used. This is

$$F(x) = B_0\psi(x) + B_1\Delta\psi(x) + B_2\Delta^2\psi(x) + \dots$$

where

$$\psi(x) = e^{-m}\,\frac{\sin \pi x}{\pi}\left[\frac{1}{x} - \frac{m}{1!\,(x-1)} + \frac{m^2}{2!\,(x-2)} - \dots\right]$$

and $\Delta\psi(x) = \psi(x) - \psi(x-1)$ (i.e. $\Delta\psi(x-1)$ in the ordinary notation of finite differences) and values of $\psi(x)$ for x < 0 are assumed to be zero. Similarly $\Delta^2\psi(x) = \Delta\psi(x) - \Delta\psi(x-1)$.

In the limit when x is an integer $\psi(x)$ becomes $e^{-m}m^x/x!$. This expression is already well known in the theory of probability as Poisson's series; the "normal curve" is sometimes spoken of as a "law of great numbers" and the Poisson series as a "law of small numbers". Type B uses $e^{-m}m^x/x!$ as a generating function similarly to the way in which Type A uses the "normal curve".

* Gram, Thiele, Charlier, Bruns, etc. In a memoir entitled Researches into the Theory of Probability (Meddelanden Lunds Astronomiska Observatorium, 1906), C. V. L. Charlier gives several numerical examples and many useful notes. J. P. Gram, on p. 94 of Om Rækkeudviklinger, bestemte ved mindste Kvadraters Methode (Copenhagen, 1879), says that Oppermann had suggested the formula some time before.

5. The fitting of Type B presents certain special difficulties as alternative methods are available, but we may as a preface to them point out that if we fit $e^{-m}m^x/x!$ by moments using all integral values of x from x = 0 to x = ∞ we obtain

$$\mu_2 = m \qquad \mu_3 = m \qquad \mu_4 = 3m^2 + m$$

or $\beta_1 = \beta_2 - 3 = 1/m$.

This, however, assumes a system of ordinates, unit distance apart, and we know that in practical statistical work these assumptions limit us unduly.
We can, however, write

$$F(xw + c) = B_0\psi(x) + B_1\Delta\psi(x) + B_2\Delta^2\psi(x) + \dots$$

which implies that owing to w we have generalised the unit of grouping and owing to c the point from which x is reckoned is also generalised.

In this form Charlier suggests four methods of fitting and remarks that the series usually becomes more convergent if we arrange constants so that $B_1 = B_2 = 0$.

(1) Assume w = 1 and c = 0, that is, revert to the original form and choose m so that $B_1$ vanishes, and since $B_0 = N$ we can reach

$$m = b \qquad\qquad 3!\,B_3 = N(-\mu_3 + 3\mu_2 - 2b)$$
$$2!\,B_2 = N(\mu_2 - b) \qquad\qquad 4!\,B_4 = N(\mu_4 - 6\mu_3 - 6b\mu_2 + 11\mu_2 + 3b^2 - 6b)$$

where b is the distance from the origin to the mean. This method can be used when we can anticipate that m will not differ greatly from b.

(2) Assume w = 1 and calculate c as an unknown constant, choosing it and m so that $B_1$ and $B_2$ vanish

$$c = b - \mu_2 \qquad\qquad 3!\,B_3 = N(\mu_2 - \mu_3)$$
$$m = \mu_2 \qquad\qquad 4!\,B_4 = N(\mu_4 - 3\mu_2^2 - 6\mu_3 + 5\mu_2)$$

(3) Find m, w and c so that $B_1 = B_2 = B_3 = 0$

$$w = \mu_3/\mu_2 \qquad\qquad B_0 = N/w$$
$$m = \mu_2^3/\mu_3^2 \qquad\qquad c = b - \mu_2^2/\mu_3$$

This method usually gives w very small values and m very large values when $\mu_3$ vanishes, so it is only applicable in markedly skew cases.

(4) Fix c arbitrarily and find m and w so that $B_1 = B_2 = 0$

$$m = (b - c)^2/\mu_2$$
$$w = \mu_2/(b - c)$$
$$B_0 = N/w$$
$$3!\,w^3 B_3 = B_0(w\mu_2 - \mu_3)$$

It seems unnecessary to give the work in detail leading up to the various sets of equations. Tables of $e^{-m}m^x/x!$ will be found in Tables for Statisticians.
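
As a rough illustration of method (1), the sketch below computes the coefficients from a series of frequencies at x = 0, 1, 2, ... and evaluates the resulting Type B series at integral values of x. It takes $\Delta\psi(x) = \psi(x) - \psi(x-1)$ with $\psi$ zero for negative x, and the formulae for $B_2$, $B_3$ and $B_4$ as set out under (1); the function names and the trial frequencies are assumptions made for the example.

```python
import math

def poisson(m, x):
    """psi(x) = e**(-m) * m**x / x! at integral x, taken as zero for x < 0."""
    return math.exp(-m) * m ** x / math.factorial(x) if x >= 0 else 0.0

def type_b_coefficients(freqs):
    """Method (1): w = 1, c = 0, m = b. freqs[x] is the frequency at x."""
    N = sum(freqs)
    b = sum(x * f for x, f in enumerate(freqs)) / N
    mu = [sum((x - b) ** r * f for x, f in enumerate(freqs)) / N for r in range(5)]
    B2 = N * (mu[2] - b) / 2
    B3 = N * (-mu[3] + 3 * mu[2] - 2 * b) / 6
    B4 = N * (mu[4] - 6 * mu[3] - 6 * b * mu[2] + 11 * mu[2] + 3 * b ** 2 - 6 * b) / 24
    return b, [N, 0.0, B2, B3, B4]

def difference(m, x, order):
    """order-th difference of psi, with first difference psi(x) - psi(x - 1)."""
    return sum((-1) ** j * math.comb(order, j) * poisson(m, x - j)
               for j in range(order + 1))

def type_b(m, coeffs, x):
    """B0*psi + B1*(difference) + B2*(second difference) + ... at integral x."""
    return sum(B * difference(m, x, s) for s, B in enumerate(coeffs))

# Trial frequencies (assumed, not data from the text).
trial = [10, 30, 25, 20, 10, 5]
m, coeffs = type_b_coefficients(trial)
fitted = [type_b(m, coeffs, x) for x in range(80)]
print(m, coeffs)
print([round(f, 1) for f in fitted[:8]])
```

If the frequencies are themselves a Poisson series the coefficients $B_2$, $B_3$ and $B_4$ all vanish, and in any case the fitted series reproduces the total frequency and the mean of the data.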





6. F. Y. Edgeworth* has used a series similar to Type A, namely

$$F(x) = \exp\left\{-\frac{\kappa_3}{3!}\left(\frac{d}{dx}\right)^3 + \frac{\kappa_4}{4!}\left(\frac{d}{dx}\right)^4 - \dots\right\}\phi_0(x)$$

where $\kappa_3$, $\kappa_4$, etc. are the third, fourth, etc. semi-invariants. Expanding the exponential, we reach

$$\frac{1}{\sigma}\{\tau_1(h) + .81649658\sqrt{\beta_1}\,\tau_4(h) + .98601330\,\beta_1\tau_7(h) + \dots + .45643546(\beta_2 - 3)\tau_5(h) + \dots \text{ etc.}\}$$

Arithmetically the difference between this series and Type A is usually small. Type A does not include the term which arises from $\dfrac{1}{2!}\cdot\dfrac{1}{3!}\cdot\dfrac{1}{3!}\,\kappa_3^2\,\dfrac{d^6}{dx^6}$. Later terms would also differ, but the expansion shown assumes that we shall not use more than four moments and that $\tau_9$ etc. terms can be ignored.
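
The two series may be compared in standard units (σ = 1) with a short sketch. It writes the differential coefficients of the normal function in terms of the polynomials $x^3 - 3x$, $x^4 - 6x^2 + 3$ and $x^6 - 15x^4 + 45x^2 - 15$, keeps only the terms shown above, and also recovers the three numerical multipliers of the tetrachoric functions; the function names are illustrative.

```python
import math

def phi0(x):
    """Normal function in standard units."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def he3(x):
    return x ** 3 - 3 * x

def he4(x):
    return x ** 4 - 6 * x ** 2 + 3

def he6(x):
    return x ** 6 - 15 * x ** 4 + 45 * x ** 2 - 15

def type_a(x, sqrt_beta1, beta2):
    """Type A as far as the term in the fourth moment."""
    return phi0(x) * (1 + sqrt_beta1 / 6 * he3(x) + (beta2 - 3) / 24 * he4(x))

def edgeworth(x, sqrt_beta1, beta2):
    """As Type A, with the added term in beta1 from the squared cubic term."""
    return type_a(x, sqrt_beta1, beta2) + phi0(x) * sqrt_beta1 ** 2 / 72 * he6(x)

# Multipliers of tau_4, tau_5 and tau_7 in the tabular form of the series.
multipliers = (math.sqrt(24) / 6, math.sqrt(120) / 24, math.sqrt(5040) / 72)
print(multipliers)

for x in (-3, -2, -1, 0, 1, 2, 3):
    print(x, round(type_a(x, 1.0, 3.0), 5), round(edgeworth(x, 1.0, 3.0), 5))
```

With $\sqrt{\beta_1} = 1$, a value assumed merely to show a skew case, the Type A ordinate at x = −3 is negative, which is the kind of negative frequency mentioned in the examples that follow.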

7. It is possible to use other expressions, e.g. Type III, instead of a normal curve as a generating function. A necessary condition for a frequency function is that it must not produce negative frequencies and the reader who wishes to pursue this part of the subject may be referred to a lecture by Professor Steffensen giving an interesting account from first principles.† For a general discussion of Edgeworth's and the A series and the theory underlying them the reader should study the papers to which reference has already been made and also Professor H. Cramér's paper "On the composition of elementary errors" in Skandinavisk Aktuarietidskrift, 1928, p. 13 etc. and p. 141 etc.

It is not, however, pretended that the curves and series set out above exhaust the suggestions that have been made, but they may be taken to represent the methods that have received most general support, and the examples we shall give do not go beyond them. We may, however, mention that it has been suggested that graduations should be made by writing y = […] This way of using the "normal curve" has been called the "Method of Translation" and in its most general form is arbitrary. In practice the form of f(x) must be restricted and certain special cases have been studied, but the method seems to be open both to practical and theoretical objections, and it will not be discussed in detail.

* Edgeworth contended that his equation was unique in its character and theoretical basis. It avoids the negative frequencies which may arise with Type A and are unjustifiable in theory. This last point will be brought out in the numerical examples. Trans. Camb. Phil. Soc. 1905 (Law of Error), J. Roy. Statist. Soc. 1906 (Generalised Law of Error).

† J. F. Steffensen, Some recent researches in the Theory of Statistics and Actuarial Science (Cambridge University Press, 1930, Third Lecture).

8. Numerical Examples 

Example I

(Symmetrical curve not capable of satisfactory graduation
by the normal curve of error)

Observations     Pearson's     Type A     Edgeworth     Normal curve
                 Type II

     11              14          15          1?              20
    116             109         106         106              95
    271             286         284         285             270
    451             433         437         436             450
    432             433         437         436             450
    267             285         283         284             270
    116             109         106         106              95
     16              14          15          16              20



In this case all the curves except the normal give excellent graduations. We have not used Type B because Charlier apparently only adopts it when Type A is unsuccessful. He does not give a statistical criterion to show when A or B should be used and it is difficult to see how such criterion can be evolved. The solution of his Type A does not lead to imaginary quantities when Type B should have been used, in the way that Pearson's Type I, for example, does when it is inapplicable. In
reaching Type A and the Edgeworth graduation we have 
used the terms involving and respectively is used 
here for the coefficient of (j>j^{x) Similarly hereafter with A^^ 
Notice that A^ involves but may also involve other 

* See Edgeworth, J. Roy. Statist. Soc. vol. LXI; Kapteyn, Skew Frequency Curves in Biology and Statistics (Groningen, 1903); or Bowley, F. Y. Edgeworth's Contributions to Mathematical Statistics, 1928.




Example II

(A distribution which is not markedly skew)

Observations     Pearson's     Type A     Edgeworth
                 Type III

      3               4           5           4
     20              17          22          17
     38              42          47          42
     63              59          60          59
     51              53          50          53
     29              33          27          32
     21              15          13          15
      4               5           4           6
      0                           1           2
      1              .4                       1



In each case three moments have been used. The observations and Edgeworth's graduation are taken from Edgeworth's paper, "The generalised law of error". Type A is the least successful.


Example III 

(A distinctly skew distribution) 


Observations

Pearson's
Type I

Type A

Type B

Edgeworth



- 2 

1 


1 



8 


9 


2 

25 

12 

30 

64 

67 

53 

64 

64 

116 

116 

90 

104 

102 

140 

138 

125 

129 

130 

145 

139 

145 

134 

135 

134 

128 

143 

128 

130 

106 

^110 i 

123 

116 

111 

82 

89 ! 

93 

93 

92 

72 

69 

65 

73 

73 

49 

51 

44 

53 

53 

37 

35 

31 

36 

36 

25 

24 

23 

25 

20 

13 

15 

16 

14 

10 

10 

9 

10 

10 

4 

5 

5 

5 

5 


2 

2 

t 2 

2 


04 

1 

1 

1 



Pearson's figures come from his Chances of Death* and Edgeworth's from his "Generalised law of error". Each of these graduations was obtained with four moments. Clearly Pearson's Type I is the best and Type B the next best graduation. We do not think Charlier would use Type A in such a case. In fitting his Type B there are, however, many difficulties owing to the fact that he gives us four approximate methods of application; this is an objection which may be surmounted in the future, but makes Type B awkward at present. The other points to be noticed in these graduations are the negative frequency in Type A and the 40 cases in Edgeworth's graduation which have no case corresponding to them in the data. Edgeworth, however, has remarked that he only aims at the main body of the curve and does not much concern himself with the tails, but one cannot help feeling that the main body must be understated if one tail possesses an excess of 40 out of 1,000 cases and the other tail is in defect by only 20.

* Chances of Death, I, 74 (London, 1897).


Example IV

(J-shaped curve)

Observations     Pearson     Type B

      ?           136.9       134.9
     55            48.5        51.6
     23            22.6        22.5
      7             9.6         9.5
      2             3.4         2.9
      2              .8          .6



The Type B curve is given by Charlier in Researches into the Theory of Probability. The Type B curve gives a slightly better graduation, but the agreement is close in both cases. The example is not conclusive as to J-shaped curves, but shows that Type B can graduate them successfully. The particular example has only six groups, and with a curve of something like the right shape and three constants we are likely to reach close agreement. Edgeworth's curve is unsuitable. A graduation by Type A has been given elsewhere, but though it apparently graduates the figures the curve is not J-shaped.




Example V

(Series which is nearly symmetrical)

Observations     Pearson's     Type A     Edgeworth
                 Type IV

     10               6           4           3
     13              16          14          10
     41              49          46          34
    115             135         126         110
    326             321         306         298
    675             653         637         662
  1,113           1,108       1,108       1,164
  1,528           1,535       1,563       1,603
  1,692           1,712       1,753       1,747
  1,530           1,522       1,548       1,510
  1,122           1,074       1,075       1,024
    611             604         589         571
    255             274         256         263
     86             102          92         104
     26              32          29          37
      8               8           7          12
      2               2           2           2
      1               1           1           1
      1

These graduations give similar results and need no comment.


Example VI

(Distribution having two maxima)

Data     Pearson's     Type A     Edgeworth
         Type II

  10          3           26           4
  78         96           74          34
 193        191          156         135
 286        261          262         270
 303        304          354         363
 291        319          390         390
 303        304          354         363
 286        261          262         270
 193        191          156         135
  78         96           74          34
  10          3           26           4


This is an imaginary example giving a double-humped distribution. It was formed from Type A by putting $A_3 = 0$ and $A_4 = .09$, the series being

−19, −53, −76, +103, +783, +1929, +2855, etc.

Negative frequencies, which are meaningless, were discarded and the data cut down and graduated. The interesting feature is that Type A, from which the data were formed, gives a poor agreement. This is due to the negative frequencies and the integration for moments from −∞ to +∞. Negative frequencies are somewhat objectionable in themselves; they are still more objectionable when they influence curve fitting to the large extent shown in this example.

Example VII

We have remarked that there is a difficulty in choosing a solution to Type B, but its graduating power compared with other formulae can be indicated by setting out a few examples of the forms taken by $e^{-m}m^x/x!$ from Tables for Statisticians.

For comparison I have added examples of Pearson's Type III, though it must not be supposed that either set is meant to give the closest agreement with the other that it would be possible to make; they have merely been taken to give an idea of the range of application. By bringing in the further terms of the series we can increase the range of Type B and by using the whole of Pearson's system we cover a wider range than that of his Type III.

        Type B                     Pearson's Type III
   I      II     III             I      II     III

  368    111      45            387     63      31
  368    244     140            386    279     149
  184    268     217            160    285     230
   61    197     224             47    189     218
   15    108     173             15    102     160
    3     48     107              4     49     101
    1     18      55              1     21      56
           6      25                     9      29
           2      10                     3      14
                   3                     1       6
                   1                             3
                                                 1
                                                 1
                                                 1



9. The few examples we have given will be of help in bringing out the comparison of the types of curves with which we have been dealing.

The Pearson-type curves will graduate satisfactorily all the examples we have taken, but cannot reproduce the double hump of our imaginary data (Example VI). They will graduate symmetrical, slightly skew and very skew distributions and also J and U-shaped distributions. They have been fitted in various circumstances and are satisfactory from the point of view of agreement. The arithmetic involved is, however, very heavy, but the curves are the most useful of those now considered.

Type A gives numerically the least work, but it does not graduate satisfactorily very skew or J and U-shaped distributions and it has therefore a smaller vogue. If, however, it is combined with Type B as Charlier suggests, J-shaped and skew distributions can be graduated. We have found some difficulty in applying Type B, for Charlier does not give much help in deciding which of his four methods of fitting should be followed in a particular case, and we feel that the graduation capacity of this type may be greater than our trials with it justify us in thinking at present. It would clearly be impossible to improve on its graduation in Example IV, but Example III and two examples given by Charlier in his Researches into the Theory of Probability are less fortunate.

Edgeworth's curve can, roughly speaking, graduate the same distributions as Type A.

10. We may now refer to two difficulties in connection with Edgeworth's curve and with Type A respectively which have already been mentioned. In Example III we found that 40 out of 1,000 in Edgeworth's graduation have no observations corresponding to them and we remarked that it seemed a large excess; the reproduction of the exact number of observations is not only a practical necessity, but is assumed by the method of moments. If, therefore, a large number of cases falls outside the observations, we must either say that the total frequency is not reproduced or that the frequencies are misplaced; in either case the main body must be artificially reduced below the amount shown in the original data. In slightly skew distributions the frequencies are satisfactorily reproduced and many of the graduations of such material are excellent, but the method can hardly be considered satisfactory as a general formula until some method of overcoming the difficulty mentioned above has been found.

The difficulty in connection with Type A is the large part that negative frequencies play in some of the less symmetrical graduations. If a negative frequency occurs, have the positive frequencies been overstated? The defence of such negatives is that further terms of the series would put things right, but it is hard to see the justification for basing much argument on constants derived from the higher moments which are liable to large variations and are unreliable. It is also unsatisfactory that a curve cannot reproduce itself even approximately, and the result of our Example VI is disappointing; probably however it would be well to consider such cases as relating to heterogeneous material and therefore more suitable for representation by two or more superimposed curves.* If Type A or Edgeworth's curve and their moments could be integrated from −a to b instead of from −∞ to ∞, the difficulties could be overcome to some extent, but, failing that, it would seem necessary to limit the range of applicability to the less abnormal distributions. An approximate method of fitting from −a to ∞ has been given†, but the results are not quite so good as Pearson's Type III.

11. If the reader makes any extensive trial with series for the purpose of graduation, he will find occasionally that the coefficients of successive terms are such as to imply that the series may not be convergent. This is closely connected with the difficulty mentioned in the preceding paragraph.

* We are doubtful if it is statistically possible ever to produce a double hump with Type A or Edgeworth's curve if the ordinary −∞ to ∞ integration is performed, because the relative values of the second and fourth moments required by the coefficient in the formula would seem impossible.

† E. C. Rhodes, J. Roy. Statist. Soc. 1925, pp. 576 et seq.



CHAPTER VII


CORRELATION

1 . We say that tall men have longer legs than short men, 
that the older a bachelor the less likely he is to marry and have 
children, that ajnan marrying late in life nsually takes a vafe 
who IS older than the wife of a man marrying early, or, to take 
an example from hfe assurance'^ractice, that, when endowment 
assurances are grouped according to the unexpired term, the 
mean ages at maturity increase with the unexpired term All 
these statements express m different words the fact that there 
IS some causal relationship, or correlation, between the height 
of a man and the length of his legs, between the ages of husband 
and wife or between the age at maturity of endowment 
assurances and the unexpired term The statements are, how- 
ever, in general terms; they do not help us to decide whether 
one relationship is closer than another, they do not supply any 
scale of correlation The object, in statistical work, is to find 
a measure, we have a scale for measuring probability and 
similarly we want a scale for measuring correlation 

This suggests that if there is no correlation our scale ought
to measure zero and, just as certainty is indicated by a probability
of unity, so we may call our correlation unity when the
relationship is as close as possible. There is, however, one point
where the analogy between probability and correlation breaks
down: there is no such thing as negative probability, but we
can easily see that we can have negative correlation, for we may
have two things, A and B, which increase together like the
ages of husband and wife, or two things, C and D, one of which
increases as the other decreases like the age of a bachelor and
the number of children born from subsequent marriages.

2. With this introduction we may set down a definition of
correlation in the following words: "two measurable characteristics,
A and B, are said to be correlated when, with
different values x of A, we do not find the same value y of B
equally likely to be associated." In other words, certain values
of B are more likely to occur with the value x than others. If
they were not, correlation would be absent, or, to take a
specific case, if men marrying at 20, or at 30, or at 60, or at any
other age always married women of 40, there would be no
correlation. On the other hand, the correlation would be
perfect if every man had to marry a woman exactly n years his
junior.


Unexpired term of endowment assurances (centre of group of 5 terms), by central age at maturity

                            Central age at maturity                              Mean maturity
  Term    30    35    40    45    50    55    60    65    70    75     Total   age for the row

     2     2     -     -     2    26     6    14     6     -     -        56       53.75
        (24)  (20)  (16)  (12)   (8)   (4)   (0)   (4)
     7     1     1     2     6    62    36    40    22     2     -       172       55.03
        (18)  (15)  (12)   (9)   (6)   (3)   (0)   (3)   (6)
    12     -     2     9    17   117    99   127    52     8     1       432       55.85
              (10)   (8)   (6)   (4)   (2)   (0)   (2)   (4)   (6)
    17     3     -     6    24   145   155   237    84    11     -       665       56.59
         (6)   (5)   (4)   (3)   (2)   (1)   (0)   (1)   (2)   (3)
    22     -     1     -     3   133   167   271    78    20     1       674       57.58
         (0)   (0)   (0)   (0)   (0)   (0)   (0)   (0)   (0)   (0)
    27     -     -     -     9    90   123   231    71    11     3       538       57.88
                           (3)   (2)   (1)   (0)   (1)   (2)   (3)
    32     -     -     -     1    11    49   127    49     8     2       247       59.94
                           (6)   (4)   (2)   (0)   (2)   (4)   (6)
    37     -     -     -     -     -     6    49    22     -     -        77       61.04
                                       (3)   (0)   (3)
    42     -     -     -     -     -     2     2     3     -     1         8       62.50
                                       (4)   (0)   (4)   (8)  (12)
    47     -     -     -     -     -     -     -     1     -     -         1       65.00
                                             (0)   (5)

 Total     6     4    17    62   584   643 1,098   388    60     8     2,870

 Mean unexpired
 term for column
        10.3  13.2  13.2  16.1  17.2  20.1  21.9  21.7  21.5  27.6

Notes. For explanation of the small numbers (in parentheses), see § 10.

A column or row is called an array. The middle value of the variable with
which the row is associated is called its type, so that the third column (i.e. that
headed 40) would be called the y-array of type 40, and the fourth row would
be called the x-array of type 17. The word 'type' is sometimes omitted.

( 142 ) 




3. The statistical aspect of the problem is exemplified in
the above table of double entry which gives particulars of
2,870 endowment assurances grouped according to unexpired
term and age at maturity.

A little examination of the table shows that correlation is
present, for we notice that the figures in the column giving the
mean maturity age for each row increase steadily from 53.75
to 65, while the term increases from 2 to 47. Similarly, the
mean unexpired terms increase from 10.3 to 27.6 as the age at
maturity increases from 30 to 75. The two sets of figures are
indicated in the diagram, p. 144. Now let us imagine that there
was no correlation; then the means of the columns would have
been independent of the other function, that is, we should have
found the same mean for each column. When plotted on a
diagram the means would have run horizontally. This suggests
that, perhaps, correlation might be measured by the slope of a
straight line drawn through the means, and we may follow up
this idea by fitting a straight line ($y = a_2 + b_2 x$) to the correlation
table and seeing what we can gather from the result.
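The means of the arrays quoted above can be recomputed directly from the frequencies. The following is a minimal sketch in Python, not part of the original work; the names `TABLE`, `row_means` and `column_means` are introduced here for illustration, and the frequencies are those of the table of 2,870 endowment assurances.

```python
# Rows: unexpired terms 2, 7, ..., 47; columns: central ages at maturity 30, ..., 75.
TERMS = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47]
AGES = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
TABLE = [
    [2, 0, 0, 2, 26, 6, 14, 6, 0, 0],
    [1, 1, 2, 6, 62, 36, 40, 22, 2, 0],
    [0, 2, 9, 17, 117, 99, 127, 52, 8, 1],
    [3, 0, 6, 24, 145, 155, 237, 84, 11, 0],
    [0, 1, 0, 3, 133, 167, 271, 78, 20, 1],
    [0, 0, 0, 9, 90, 123, 231, 71, 11, 3],
    [0, 0, 0, 1, 11, 49, 127, 49, 8, 2],
    [0, 0, 0, 0, 0, 6, 49, 22, 0, 0],
    [0, 0, 0, 0, 0, 2, 2, 3, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]

def row_means(table, column_values):
    """Mean of the column variable within each row (each x-array)."""
    return [sum(f * v for f, v in zip(row, column_values)) / sum(row)
            for row in table]

def column_means(table, row_values):
    """Mean of the row variable within each column (each y-array)."""
    return [sum(f * v for f, v in zip(col, row_values)) / sum(col)
            for col in zip(*table)]

mean_age_by_term = row_means(TABLE, AGES)
mean_term_by_age = column_means(TABLE, TERMS)

for term, mean_age in zip(TERMS, mean_age_by_term):
    print(f"term {term:2d}: mean maturity age {mean_age:.2f}")
for age, mean_term in zip(AGES, mean_term_by_age):
    print(f"age {age:2d}: mean unexpired term {mean_term:.1f}")
```

If there were no correlation, each of the two printed series would be roughly constant; the drift in both is what the straight line is meant to summarise.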



4. When we were fitting curves to frequency distributions
we used the Method of Moments, and the following proof
adopts the same principle.

Let $x_1, y_1$; $x_2, y_2$; etc. be associated deviations, and let

$y = a_2 + b_2 x$

be the straight line used in the graduation; then the graduated
figure corresponding to $x_1$ is $a_2 + b_2 x_1$.

Now, if we proceed as we did in fitting frequency-curves by
the method of moments, we make the graduated and ungraduated
areas, means, etc. equal, or

$(a_2 + b_2 x_1) + (a_2 + b_2 x_2) + \ldots = y_1 + y_2 + \ldots$

or $N a_2 + b_2 S'(x) = S'(y)$

And $(a_2 + b_2 x_1)x_1 + (a_2 + b_2 x_2)x_2 + \ldots = x_1 y_1 + x_2 y_2 + \ldots$

or $a_2 S'(x) + b_2 S'(x^2) = S'(xy)$

where $S'(x)$, being the sum of all the x's, gives the first moment
of the x's, $S'(y)$ the first moment of the y's, $S'(x^2)$ the second
moment for the x's, and $S'(xy)$ a moment in which any frequency
is multiplied by the product of the distances in the x
and y directions.*

[Diagram, p. 144: means of arrays and the two regression lines.]

Note. The mean unexpired terms corresponding to actual central ages at
maturity are shown × and the mean central ages at maturity corresponding to
actual unexpired terms are shown ○.

The diagram is arranged so that the standard deviation of the maturity ages
is represented by the same length as the standard deviation of the unexpired
terms and, consequently, the angles formed by the two regression lines with their
respective axes are equal. The tangent of this angle in each case is r (.254).

If these moments are now transferred to the mean, as was
done in fitting the frequency-curves, we have

$N a_2 = 0$ or $a_2 = 0$

and $b_2 S(x^2) = S(xy)$ or $b_2 = \dfrac{S(xy)}{S(x^2)}$

But we have already seen that the second moment of the
whole frequency ($N$) is $N\sigma_1^2$, therefore

$b_2 = \dfrac{S(xy)}{N\sigma_1^2}$

and

$y = \dfrac{S(xy)}{N\sigma_1^2}\,x$

If we now write $S(xy) = N r \sigma_1 \sigma_2$ we have

$y = r\dfrac{\sigma_2}{\sigma_1}\,x$

$x = r\dfrac{\sigma_1}{\sigma_2}\,y$

where r will represent the statistical measure of correlation
(coefficient of correlation) between the x's and y's and the
second equation has been evolved similarly to the first.
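The result of this section can be tried on a handful of paired values. The sketch below is illustrative only: the function name `moment_fit` and the five pairs of figures are hypothetical and do not come from the text. It refers the moments to the means, forms $S(xy)$, $\sigma_1$ and $\sigma_2$, and returns r together with the slopes of the two lines.

```python
from math import sqrt

def moment_fit(xs, ys):
    """Return r and the slopes r*s2/s1 (y on x) and r*s1/s2 (x on y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations from the mean
    dy = [y - mean_y for y in ys]
    s_xy = sum(a * b for a, b in zip(dx, dy))
    sigma1 = sqrt(sum(a * a for a in dx) / n)
    sigma2 = sqrt(sum(b * b for b in dy) / n)
    r = s_xy / (n * sigma1 * sigma2)
    return r, r * sigma2 / sigma1, r * sigma1 / sigma2

# Hypothetical paired observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r, slope_y_on_x, slope_x_on_y = moment_fit(xs, ys)
print(r, slope_y_on_x, slope_x_on_y)
```

The product of the two slopes is $r^2$, which is one way of seeing that the two lines coincide only when r is unity.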


* Cp. Table II, p. 16. The frequency 29 is multiplied by the appropriate
value (-4). It would be the same thing if we took the distance (-4) of each
of the 29 cases and added these twenty-nine (-4)'s together.

( 145 )



5. At first sight it may appear that the two equations just
given, showing the relationship between x and y, are not
consistent. It must, however, be remembered that the first,
$\bar{y} = r\dfrac{\sigma_2}{\sigma_1}x$, gives the mean values of y corresponding to particular
values of x (indicated by the insertion of a bar over the y),
while the second gives the mean values of x corresponding to
particular values of y. To take a simple case as an example,
assume that $\sigma_1 = \sigma_2 = 1$ and that $r = .1$; then if $x = 0$ the mean
of the y's corresponding to this value of x is 0, and if $x = 20$ the
mean of the y's will be 2. When we turn the matter round,
however, we cannot, of course, assert that the mean of the x's
corresponding to $y = 2$ is 20; it will be .2.
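The numerical case just given can be checked in a few lines. This is only a sketch; the two function names are introduced here for illustration and the standard deviations default to unity as in the example.

```python
def mean_y_given_x(x, r, sigma1=1.0, sigma2=1.0):
    """First equation: mean of the y's for a given x."""
    return r * sigma2 / sigma1 * x

def mean_x_given_y(y, r, sigma1=1.0, sigma2=1.0):
    """Second equation: mean of the x's for a given y."""
    return r * sigma1 / sigma2 * y

r = 0.1
y_bar = mean_y_given_x(20.0, r)    # about 2
x_bar = mean_x_given_y(y_bar, r)   # about .2, not 20
print(y_bar, x_bar)
```

The second line does not undo the first unless r is unity, which is the point of the remark.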

6. After this preliminary remark we may return to the two
equations and consider how it is that r is a measure of correlation
and whether it can always be treated as a satisfactory
measure. We can best see that r is a measure of correlation by
rewriting the equation $\bar{y} = r\dfrac{\sigma_2}{\sigma_1}x$ in the form $\dfrac{\bar{y}}{\sigma_2} = r\dfrac{x}{\sigma_1}$ or
$\bar{Y} = Xr$, and we can then interpret it as giving one characteristic
in terms of the other where the mean is the origin (this
is due to referring moments to the mean in the proof) and the
unit of measurement is the standard deviation in each case.
In this form we see at once that as one characteristic ($X$)
increases the mean ($\bar{Y}$) of the corresponding series of the other
characteristic increases to an extent which depends on the
value of r; while if r is negative $\bar{Y}$ decreases. It is only if r is
unity that the increments of $X$ and $\bar{Y}$ become equal and absolute
correlation is reached. If $\bar{Y}$ remains constant as the
value of $X$ increases, the definition at the beginning of this
chapter tells us that there is no correlation, and r in this case
is zero as can easily be seen from the equation $\bar{Y} = Xr$. We
have anticipated that our scale for measuring correlation
should run from -1 to +1, but we may accentuate the fact
that a large negative value does not mean that the two
characteristics do not vary together but only that increases in
the one correspond with decreases in the other; the numerical
value of r indicates the extent to which variations in the two
characteristics correspond. This indication is satisfactory provided
the means, when plotted in a diagram such as that on
p. 144, fall approximately in a straight line (i.e. "regression"*
is linear). Distinct deviations from linearity are not so common
as might be supposed, but if they are very marked in any case,
r ceases to be a satisfactory measure of the correlation.

7. We may take this opportunity of removing another
difficulty that is sometimes met. Some students have a doubt
which is best shown by the question: "How can there be perfect
correlation when one thing is always smaller than another?"
As an example we may take the correlation between the lengths
of a man's right arm and his left arm; here the coefficient of
correlation would be practically unity, and since each characteristic
is measured from its own mean, and in terms of its
own standard deviation, the coefficient would not be decreased
if every left arm was a certain number of inches shorter than
the right or if it bore a fixed relation in length, say 99/100, to
the right arm.
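The point about the two arms can be illustrated numerically. In the sketch below the arm lengths are hypothetical figures chosen only for the illustration, and `correlation` is a helper written for this example; it measures each series from its own mean and in terms of its own standard deviation.

```python
from math import sqrt

def correlation(xs, ys):
    """Coefficient of correlation, each series referred to its own mean."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical right-arm lengths in inches.
right = [28.0, 29.5, 30.0, 31.2, 32.4]
left_shorter = [x - 0.5 for x in right]    # always half an inch shorter
left_ratio = [0.99 * x for x in right]     # always 99/100 of the right arm

r_shift = correlation(right, left_shorter)
r_ratio = correlation(right, left_ratio)
print(r_shift, r_ratio)
```

Both coefficients come out at unity (to within rounding), although in each case one arm is always smaller than the other.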

8. It is now necessary to discuss the arithmetical calculations
and if we look back at the formulae at the end of § 4 we see that
we require two standard deviations and a value for $S(xy)$. We
have already seen how standard deviations are obtained and it
will be remembered that when the calculation of moments was
discussed we found that, though they were required about the
mean, it was best in practice to take them about some point
fixed arbitrarily so as to avoid fractions and then adjust the
results afterwards. The values of $\sigma_1$ and $\sigma_2$ can, of course,
be found with the help of the formula on p. 57, viz. $\mu_2 = \nu_2 - d^2$.
The deduction of $\tfrac{1}{12}$ from the second moment should be made
for the same reason and in the same cases as in frequency-curve
fitting.

* The term "regression" was adopted by Francis Galton in connection with
the study of heredity; it indicates the way the children of particular parents
tend to "step back" to the ordinary population mean.

( 147 )






With regard to the product moment we have

$S(x'y') = S(x + d_1)(y + d_2)$
$\qquad = S(xy) + d_1 S(y) + d_2 S(x) + N d_1 d_2$,

or since $S(x) = S(y) = 0$

$S(xy) = S(x'y') - N d_1 d_2$

where $S(x'y')$ is calculated about a point distant $d_1$ from the
mean of the x's and $d_2$ from the mean of the y's.
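The adjustment of the product moment can be verified on a small example. The figures below are hypothetical and the helper `product_moment_about` is introduced only for this sketch; $d_1$ and $d_2$ are taken as the distances of the means from the arbitrary point.

```python
def product_moment_about(xs, ys, origin_x, origin_y):
    """Sum of products of distances measured from the given point."""
    return sum((x - origin_x) * (y - origin_y) for x, y in zip(xs, ys))

# Hypothetical paired values and an arbitrary origin chosen to avoid fractions.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 1.0, 5.0, 6.0]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
origin_x, origin_y = 3.0, 4.0

d1 = mean_x - origin_x      # mean of the x' values
d2 = mean_y - origin_y      # mean of the y' values
s_arbitrary = product_moment_about(xs, ys, origin_x, origin_y)   # S(x'y')
s_mean = product_moment_about(xs, ys, mean_x, mean_y)            # S(xy)
print(s_arbitrary, s_mean, s_arbitrary - n * d1 * d2)
```

The last two printed figures agree, as the formula requires.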

9. The statistical example on p. 142 can now be worked
through. It will be found to make the proofs and methods
given above much easier to grasp.

A point about which moments are to be calculated is first
fixed, say the middle of the group corresponding to maturity
age 60 and unexpired term 22 years, and for the present the
calculations are made about this point. The following table
shows the calculation of the mean and the second moment of
the totals of the y-arrays, i.e. the totals at the bottom of the
table, because columns are y-arrays and rows x-arrays.


Frequency     x'    Frequency × x'    Frequency × (x')²

        6     -6             36                216
        4     -5             20                100
       17     -4             68                272
       62     -3            186                558
      584     -2          1,168              2,336
      643     -1            643                643
    1,098      0         -2,121
      388      1            388                388
       60      2            120                240
        8      3             24                 72

2,870 = N                 + 532              4,825
                        - 1,589

$d = -\dfrac{1589}{2870} = -.55366$

Hence, the mean age = 60 - 2.7683 = 57.2317, because the
unit of grouping is 5 years.

$\mu_2 = \dfrac{4825}{2870} - d^2 - \dfrac{1}{12}$ (Sheppard's adjustment)
$\quad = 1.37465 - .08333$
$\quad = 1.29132$
$\sigma_1 = 1.13637$

Treating the rows in the same way, the following table was
formed:

Frequency     y'    Frequency × y'    Frequency × (y')²

       56     -4            224                896
      172     -3            516              1,548
      432     -2            864              1,728
      665     -1            665                665
      674      0         -2,269
      538      1            538                538
      247      2            494                988
       77      3            231                693
        8      4             32                128
        1      5              5                 25

2,870 = N               + 1,300              7,209
                          - 969

Mean unexpired term = 22 - 1.68815 = 20.31185

$\mu_2 = \dfrac{7209}{2870} - d^2 - \dfrac{1}{12} = 2.31453$

and $\sigma_2 = 1.52135$
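The two small tables can be checked mechanically. The following sketch is not from the original; `grouped_moments` is a name introduced here, and it assumes, as in the text, moments taken about an arbitrary origin in units of the 5-year grouping with Sheppard's adjustment of 1/12.

```python
from math import sqrt

def grouped_moments(frequencies, offsets):
    """Return N, d and the adjusted second moment about the mean (grouping units)."""
    n = sum(frequencies)
    d = sum(f * o for f, o in zip(frequencies, offsets)) / n
    nu2 = sum(f * o * o for f, o in zip(frequencies, offsets)) / n
    mu2 = nu2 - d * d - 1 / 12          # Sheppard's adjustment
    return n, d, mu2

# Totals of the columns (maturity ages 30..75) and of the rows (terms 2..47).
age_totals = [6, 4, 17, 62, 584, 643, 1098, 388, 60, 8]
term_totals = [56, 172, 432, 665, 674, 538, 247, 77, 8, 1]

n, d1, mu2_x = grouped_moments(age_totals, range(-6, 4))    # origin at age 60
_, d2, mu2_y = grouped_moments(term_totals, range(-4, 6))   # origin at term 22
sigma1, sigma2 = sqrt(mu2_x), sqrt(mu2_y)
mean_age = 60 + 5 * d1
mean_term = 22 + 5 * d2
print(mean_age, sigma1, mean_term, sigma2)
```

The printed figures should agree with 57.2317, 1.13637, 20.31185 and 1.52135 to the accuracy of the hand calculation.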

10. The value of $S(xy)$ is formed with the help of the
numbers in very small type appearing under the frequencies
in the correlation table. The frequency 62 in the 50 column,
for instance, is distanced three spaces upwards and two sideways
from the arbitrary origin, so the value of $x'y'$ by which it
has to be multiplied is 3 × 2 = 6, as shown in the small type.
The other figures are obtained in like manner, but the sign
must be borne in mind. Any value from the left-hand upper
division of the table, or in the lower right-hand division, will be
positive, because the frequency will be multiplied by a product
of an x and y having like signs, while any value from the other
divisions will be negative, because the x and y by which the
frequencies are multiplied are of opposite signs.

The calculation of the product moment is as follows:

Frequencies                                  x'y'   Total of frequencies (f)   f × x'y'

155 + 71 - 84 - 123                            1            + 19                  + 19
145 + 99 + 11 + 49 - 11 - 52 - 49 - 90         2             102                   204
24 + 36 + 3 + 22 - 22 - 6 - 9                  3              48                   144
6 + 6 + 8 + 3 - 6 - 8 - 11 - 2 + 117           4             113                   452
1                                              5               1                     5
3 + 17 + 62 + 2 - 1 - 1 - 2                    6              80                   480
9 + 26                                         8              35                   280
6                                              9               6                    54
2                                             10               2                    20
2 + 2 + 1                                     12               5                    60
1                                             15               1                    15
1                                             18               1                    18
2                                             24               2                    48

                                                                                 1,799

$S(xy) = S(x'y') - N d_1 d_2$
$\qquad = 1799 - N d_1 d_2$
$\qquad = 1262.51$

$r = \dfrac{S(xy)}{N\sigma_1\sigma_2} = \dfrac{1262.51}{2870 \times 1.13637 \times 1.52135} = .254$

The coefficient of correlation between age at maturity and
the unexpired term of endowment assurances is .254.

The equation representing the one function in terms of the
other is

$x = r\dfrac{\sigma_1}{\sigma_2}\,y = .190\,y$

where all measurements are made from the mean and the unit
is 5 years. The line drawn in the figure gives this result.
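The whole calculation of §§ 9 and 10 can be repeated from the frequencies in one pass. The sketch below is illustrative and not part of the original; the variable names are introduced here, the arbitrary origin is age 60 and term 22, and Sheppard's adjustment is applied to the two second moments as in the text.

```python
from math import sqrt

# Rows: unexpired terms 2, 7, ..., 47; columns: maturity ages 30, 35, ..., 75.
TABLE = [
    [2, 0, 0, 2, 26, 6, 14, 6, 0, 0],
    [1, 1, 2, 6, 62, 36, 40, 22, 2, 0],
    [0, 2, 9, 17, 117, 99, 127, 52, 8, 1],
    [3, 0, 6, 24, 145, 155, 237, 84, 11, 0],
    [0, 1, 0, 3, 133, 167, 271, 78, 20, 1],
    [0, 0, 0, 9, 90, 123, 231, 71, 11, 3],
    [0, 0, 0, 1, 11, 49, 127, 49, 8, 2],
    [0, 0, 0, 0, 0, 6, 49, 22, 0, 0],
    [0, 0, 0, 0, 0, 2, 2, 3, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]
X_OFFSETS = range(-6, 4)   # x' for ages 30..75, origin at age 60
Y_OFFSETS = range(-4, 6)   # y' for terms 2..47, origin at term 22

n = sx = sy = sxx = syy = sxy = 0
for y, row in zip(Y_OFFSETS, TABLE):
    for x, f in zip(X_OFFSETS, row):
        n += f
        sx += f * x
        sy += f * y
        sxx += f * x * x
        syy += f * y * y
        sxy += f * x * y          # frequency times the small-type product x'y'

d1, d2 = sx / n, sy / n
var1 = sxx / n - d1 ** 2 - 1 / 12     # second moments about the mean,
var2 = syy / n - d2 ** 2 - 1 / 12     # with Sheppard's adjustment
s_xy_mean = sxy - n * d1 * d2         # S(xy) = S(x'y') - N d1 d2
r = s_xy_mean / (n * sqrt(var1 * var2))
slope_x_on_y = r * sqrt(var1 / var2)
print(sxy, round(s_xy_mean, 2), round(r, 3), round(slope_x_on_y, 3))
```

Working in exact frequencies avoids the sign slips that the grouping by products is liable to; the coefficient should again come to about .254.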

( 150 ) 




11. An alternative method similar to the summation
method given in § 9, Chapter III for moments can be conveniently
used in connection with correlation tables.

Taking the same example, we obtain from the given table
another in the same form, giving the y sum of it by summing
each column continuously, and then form a third table by
summing the second table across continuously.


Table of the y-sum of Correlation Table

Unexpired term of endowment assurances, by central age at maturity

  Term      30      35      40      45      50      55      60      65      70      75    Totals

     2       6       4      17      62     584     643   1,098     388      60       8     2,870
     7       4       4      17      60     558     637   1,084     382      60       8     2,814
    12       3       3      15      54     496     601   1,044     360      58       8     2,642
    17       3       1       6      37     379     502     917     308      50       7     2,210
    22       0       1       0      13     234     347     680     224      39       7     1,545
    27       0       0       0      10     101     180     409     146      19       6       871
    32       0       0       0       1      11      57     178      75       8       3       333
    37       0       0       0       0       0       8      51      26       0       1        86
    42       0       0       0       0       0       2       2       4       0       1         9
    47       0       0       0       0       0       0       0       1       0       0         1

Totals      16      13      55     237   2,363   2,977   5,463   1,914     294      49    13,381


Table of x-sum of above Table, i.e. Table giving all
cases for xy group and over in Correlation Table

Unexpired term of endowment assurances, by central age at maturity

  Term      30      35      40      45      50      55      60      65      70      75    Totals

     2   2,870   2,864   2,860   2,843   2,781   2,197   1,554     456      68       8    18,501
     7   2,814   2,810   2,806   2,789   2,729   2,171   1,534     450      68       8    18,179
    12   2,642   2,639   2,636   2,621   2,567   2,071   1,470     426      66       8    17,146
    17   2,210   2,207   2,206   2,200   2,163   1,784   1,282     365      57       7    14,481
    22   1,545   1,545   1,544   1,544   1,531   1,297     950     270      46       7    10,279
    27     871     871     871     871     861     760     580     171      25       6     5,887
    32     333     333     333     333     332     321     264      86      11       3     2,349
    37      86      86      86      86      86      86      78      27       1       1       623
    42       9       9       9       9       9       9       7       5       1       1        68
    47       1       1       1       1       1       1       1       1       0       0         8

Totals  13,381  13,365  13,352  13,297  13,060  10,697   7,720   2,257     343      49    87,521


( 151 ) 






The totals in the right-hand column of the upper table give
the first sum of the total in the right-hand column of the
correlation table, and are the same as the column x = 30 in the
lower table. The total of the y sum, or of the first column in the
xy table, gives the mean of the y's (13,381/2,870), and similarly
the sum of the first row gives the mean of the x's
(18,501/2,870).

The total of the last table gives the xy moment (87,521), and
the x standard deviation is found by forming from the first row
the series 18501, 15631, 12767, 9907, 7064, 4283, 2086, 532,
76, 8, and summing it, i.e. 70,855. The second moment about
the mean can then be found, the numerical work being as
follows:

x mean $= \dfrac{18501}{2870} = 6.4463$

x second moment about the mean $= 2S_2 - d(1 + d)$
$\qquad = \dfrac{2 \times 70855}{2870} - 6.4463 \times 7.4463$
$\qquad = 1.3747$

Similarly with the y moments:

y mean $= \dfrac{13381}{2870} = 4.6624$

y second moment about the mean
$\qquad = \dfrac{2\,(13381 + 10511 + 7697 + 5055 + 2845 + 1300 + 429 + 96 + 10 + 1)}{2870} - 4.6624 \times 5.6624$
$\qquad = 2.3979$

The xy moment $= \dfrac{87521}{2870} - 6.4463 \times 4.6624 = .4399$

Remembering that $\nu_2 - \tfrac{1}{12}$ (Sheppard's adjustment) $= \mu_2$
and that the means are, in the above work, measured from the
centre of the group x = 25, y = -3 years, the values just
given will be found to agree with those previously obtained by
the direct method. The xy moment (.4399) is the same as
$\dfrac{1262.5}{2870}$, i.e. $S(xy)/N$.

12. We have already remarked (§ 6) that the method we have
used for measuring correlation assumes that the means of the
rows and of the columns, respectively, lie on straight lines and
consequently we must examine a table to see whether this
holds. One advantage of the method given in the previous
paragraph is that it enables us to get the means of each
column and of each row very easily. Remembering

(1) that the interval between the groups of unexpired terms
is 5 years,

(2) that the interval between central ages at maturity is
5 years,

(3) that the arbitrary origin is the point representing central
age at maturity = 25 and unexpired term = -3,

we can get the means of the columns by taking each total in the
y-sum table, multiplying by 5, dividing by the number of cases
and subtracting 3; thus, for the column with central age 50,

$\dfrac{2363 \times 5}{584} - 3 = 17.23$

The means of the rows come from the differences of the totals
on the right of the next table and thus for unexpired term 2 we
have

(18501 - 18179) × 5/56 + 25 = 53.75

and for unexpired term 17

(14481 - 10279) × 5/665 + 25 = 56.59
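The summation method of §§ 11 and 12 lends itself to a short program. The sketch below is not from the original; `cumulate_from_end` and the other names are introduced here. It builds the y-sum table by cumulating each column from the bottom, then the second table by cumulating each row from the right, and reads the means and moments off the totals.

```python
TABLE = [   # rows: terms 2..47; columns: maturity ages 30..75
    [2, 0, 0, 2, 26, 6, 14, 6, 0, 0],
    [1, 1, 2, 6, 62, 36, 40, 22, 2, 0],
    [0, 2, 9, 17, 117, 99, 127, 52, 8, 1],
    [3, 0, 6, 24, 145, 155, 237, 84, 11, 0],
    [0, 1, 0, 3, 133, 167, 271, 78, 20, 1],
    [0, 0, 0, 9, 90, 123, 231, 71, 11, 3],
    [0, 0, 0, 1, 11, 49, 127, 49, 8, 2],
    [0, 0, 0, 0, 0, 6, 49, 22, 0, 0],
    [0, 0, 0, 0, 0, 2, 2, 3, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]

def cumulate_from_end(values):
    """Running totals taken from the last item back to the first."""
    out, running = [], 0
    for v in reversed(values):
        running += v
        out.append(running)
    return out[::-1]

# y-sum table: each column summed continuously from the bottom row upwards.
y_sum = [list(r) for r in zip(*[cumulate_from_end(c) for c in zip(*TABLE)])]
# Second table: each row of the y-sum table summed continuously from the right.
xy_sum = [cumulate_from_end(row) for row in y_sum]

n = xy_sum[0][0]
y_total = sum(map(sum, y_sum))             # 13,381 in the text
row_totals = [sum(row) for row in xy_sum]  # right-hand column of second table
xy_total = sum(row_totals)                 # 87,521 in the text

mean_x, mean_y = row_totals[0] / n, y_total / n
xy_moment = xy_total / n - mean_x * mean_y
second_sum_x = sum(cumulate_from_end(xy_sum[0]))        # 70,855 in the text
var_x = 2 * second_sum_x / n - mean_x * (1 + mean_x)    # before Sheppard

# Means of arrays (origin at age 25 and term -3, unit 5 years).
column_totals = [sum(col) for col in zip(*y_sum)]
mean_term_age_50 = column_totals[4] * 5 / sum(r[4] for r in TABLE) - 3
mean_age_term_2 = (row_totals[0] - row_totals[1]) * 5 / sum(TABLE[0]) + 25
print(xy_moment, var_x, mean_term_age_50, mean_age_term_2)
```

Only additions are needed until the final step, which is the attraction of the method for hand work.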

13. There is yet a third way of doing the arithmetical work to
reach the coefficient of correlation and, as it is short and relies
on one of the series of means, it has a good deal to commend it.
The calculation is as follows:

First moment of column          Distance of column
(for total frequency in         from the arbitrary
column) about arbitrary         origin of rows
origin of columns               (i.e. age 60)
(i.e. unexpired term 22)
          (1)                          (2)                (1) × (2)

         - 14                          -6                  +    84
         -  7                          -5                  +    35
         - 30                          -4                  +   120
         - 73                          -3                  +   219
         -557                          -2                  + 1,114
         -238                          -1                  +   238
                                        0
         - 26                          +1                  -    26
         -  6                          +2                  -    12
         +  9                          +3                  +    27

                                                             1,799

$r = \left(\dfrac{\text{Total of (3)}}{N} - d_1 d_2\right) \div \sigma_1\sigma_2 = .254$ as in § 10.

The unit throughout is 5 years and the easiest way to do the
calculation for col. (1) is as shown in the table on p. 155.

There is no need to insert a column for age 60, or a row for
term 22, as these are multiplied by zero; they are sometimes
worked out for completeness and because they make it easier
to apply arithmetical checks which the reader can evolve for
himself.

If the reader considers any item in this scheme, e.g. 18 in
column headed 40, he will see that it represents 9 cases in the
table (p. 142) multiplied by -2, and when it is, amongst
other numbers, taken to col. (1) of the table above, it will be
multiplied by -4, that is, we shall have multiplied 9 by
(-2) × (-4), i.e. by 8, which is the little figure written under 9
in the table on p. 142.
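The third method can be sketched as follows. The code is illustrative only; the names are introduced here, and the values of $\sigma_1$ and $\sigma_2$ are simply those quoted in § 9 rather than recomputed.

```python
TABLE = [   # rows: terms 2..47; columns: maturity ages 30..75
    [2, 0, 0, 2, 26, 6, 14, 6, 0, 0],
    [1, 1, 2, 6, 62, 36, 40, 22, 2, 0],
    [0, 2, 9, 17, 117, 99, 127, 52, 8, 1],
    [3, 0, 6, 24, 145, 155, 237, 84, 11, 0],
    [0, 1, 0, 3, 133, 167, 271, 78, 20, 1],
    [0, 0, 0, 9, 90, 123, 231, 71, 11, 3],
    [0, 0, 0, 1, 11, 49, 127, 49, 8, 2],
    [0, 0, 0, 0, 0, 6, 49, 22, 0, 0],
    [0, 0, 0, 0, 0, 2, 2, 3, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]
Y_OFFSETS = list(range(-4, 6))   # rows measured from unexpired term 22
X_OFFSETS = list(range(-6, 4))   # columns measured from age 60

# Col. (1): first moment of each column about the arbitrary origin of columns.
col_first_moments = [sum(f * y for f, y in zip(col, Y_OFFSETS))
                     for col in zip(*TABLE)]
# Col. (3): col. (1) multiplied by the distance of the column, col. (2).
products = [m * x for m, x in zip(col_first_moments, X_OFFSETS)]
total_3 = sum(products)

n = sum(map(sum, TABLE))
d1 = sum(f * x for row in TABLE for f, x in zip(row, X_OFFSETS)) / n
d2 = sum(col_first_moments) / n
sigma1, sigma2 = 1.13637, 1.52135    # values quoted in the text (§ 9)
r = (total_3 / n - d1 * d2) / (sigma1 * sigma2)
print(col_first_moments, total_3, round(r, 3))
```

The column for age 60 is worked out here for completeness; being multiplied by zero it contributes nothing to the total.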

(Frequency in column) × (distance from arbitrary origin)

Distance from                           Maturity age
arbitrary origin      30     35     40     45     50     55     65     70     75

"Minus" products
       -4              8      -      -      8    104     24     24      -      -
       -3              3      3      6     18    186    108     66      6      -
       -2              -      4     18     34    234    198    104     16      2
       -1              3      -      6     24    145    155     84     11      -

Total minus           14      7     30     84    669    485    278     33      2

"Plus" products
       +1              -      -      -      9     90    123     71     11      3
       +2              -      -      -      2     22     98     98     16      4
       +3              -      -      -      -      -     18     66      -      -
       +4              -      -      -      -      -      8     12      -      4
       +5              -      -      -      -      -      -      5      -      -

Total plus             -      -      -     11    112    247    252     27     11

Figs. for col. (1)   -14     -7    -30    -73   -557   -238    -26     -6     +9

Before dealing with other examples and methods, it may be
well to point out a use to which the particular example might
be put. The result in the equation form gives the average age
corresponding to each unexpired term. Now, we might weight
each entry with Lidstone's Z or with the temporary annuities,
then work out an equation in each case, and get new
series of average ages. The results used in a valuation would give
the relative accuracy of the three methods. I have worked out
the formula with the Z weights ($H^M$ Table), and found* that

Age at maturity = 57.595 + .1200 × (unexpired term)

The results could also be used as a rough check on the average
ages at valuations, and there certainly seems a possibility of
doing something towards making a simple "model office" for
endowment assurances with the help of the method we have
been using.

* The method used by me was approximate and can probably be improved;
the result is merely given as an indication of a possible line for research.


( 155 ) 





CHAPTER VIII 


THEORETICAL DISTRIBUTIONS 
SPURIOUS CORRELATION 

1. In the previous chapter we saw that it was natural to
want a scale for measuring correlation and we showed that if
we simplify the full table by fitting a straight line* to the
statistics, then its slope might be taken as a measure of correlation.
But, though this seems reasonable on the evidence, it is
not conclusive; it might be better to use some function of the
slope of the line rather than the slope itself, or, we might find
from experience that a straight line was not the best thing to
use in our simplification. We have, therefore, to see if these
doubts can be removed, and a way to do this is to consider
correlation from a theoretical standpoint by building up tables
in which we can estimate the amount of correlation from
general considerations.

* We reach two straight lines for each correlation table, one corresponding
to the means of the columns and the other corresponding to the means of the
rows.

2. Various correlation tables can be devised, but we may
begin by taking a case where ten coins are tossed and eight
of them are left on the table, the other two being re-tossed.
Then we have a pair of tossings in which eight coins out of ten
are common to each member of the pair. We repeat the experiment
a number of times and produce a correlation table in
which, as eight out of ten coins are fixed, we may expect the
correlation to be measured by .8 or at any rate by a function of
.8. Similarly, if we leave 5 coins the coefficient should be .5
and if we leave 2 coins it should be .2. The tables worked out
theoretically would be as shown on pp. 165-7. These tables
are symmetrical, the two standard deviations are the same and
the means (see last column and bottom row) run in a straight
line and the slope of the line, judged by the tangent of the angle
it makes with the horizontal, is .8 or .5 or .2.

3. We may indicate how the tables were formed by taking
the one that gives the numbers when eight coins are left.
Consider those cases in which there were ten "heads" at the
first throw. Then all the eight coins left must be "heads". The
other two on being re-tossed will be in the following proportions:

2 Heads .............. 1 case
1 Head and 1 Tail .... 2 cases
2 Tails .............. 1 case

Thus, with the eight heads left, we conclude that for four cases
producing 10 heads at the first throw, one will produce
10 heads at the second throw, two will produce 9 heads, and
one will produce 8 heads. Now consider the next case where
there are nine "heads" and one "tail" at the first throw. Then
we can leave either eight heads or seven heads and one tail;
the number of ways in which we can do this is $_9C_8$ and $_9C_7$, that
is 9 and 36 or, as we are only concerned with proportions, as
1 : 4. The two re-tossed coins will be thrown in the proportion
of 1 HH : 2 HT : 1 TT, and we can then produce the second
column. The reader will appreciate that the totals of the
columns will be a multiple of the terms of the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^{10}$.

The coefficients worked out by the methods of Chapter VII
give the values .8, .5 and .2. For example, for the .8 table $S(xy)$
will be found to be 8192 and $\sigma_1^2 = \sigma_2^2 = 2.5$. Therefore

$r = \dfrac{S(xy)}{N\sigma_1\sigma_2} = \dfrac{8192}{4096 \times 2.5} = .8$
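The theoretical table can be built by enumeration. The sketch below is an assumption of this note rather than the text's own layout: instead of conditioning on the first throw, it counts the heads among the coins left and among the coins that differ between the two throws, which gives the same proportions. The names `coin_table` and `correlation_from_table` are introduced here.

```python
from math import comb, sqrt

def coin_table(common, total=10):
    """table[i][j] = cases with i heads at the first throw and j at the second."""
    free = total - common
    table = [[0] * (total + 1) for _ in range(total + 1)]
    for k in range(common + 1):          # heads among the coins left
        for a in range(free + 1):        # heads among the other coins, first throw
            for b in range(free + 1):    # heads among the re-tossed coins
                table[k + a][k + b] += (comb(common, k) * comb(free, a)
                                        * comb(free, b))
    return table

def correlation_from_table(table):
    """Return N, S(xy) about the means, and r for a square table of counts."""
    n = sum(map(sum, table))
    cells = [(i, j, f) for i, row in enumerate(table) for j, f in enumerate(row)]
    mean_i = sum(i * f for i, _, f in cells) / n
    mean_j = sum(j * f for _, j, f in cells) / n
    s_xy = sum((i - mean_i) * (j - mean_j) * f for i, j, f in cells)
    s_xx = sum((i - mean_i) ** 2 * f for i, _, f in cells)
    s_yy = sum((j - mean_j) ** 2 * f for _, j, f in cells)
    return n, s_xy, s_xy / sqrt(s_xx * s_yy)

table8 = coin_table(8)
n8, s_xy8, r8 = correlation_from_table(table8)
r5 = correlation_from_table(coin_table(5))[2]
r2 = correlation_from_table(coin_table(2))[2]
print(table8[10][8:], n8, s_xy8, r8, r5, r2)
```

The row for ten heads at the first throw reproduces the proportions 1, 2, 1 given above, and the total of 4096 cases and product moment of 8192 agree with the figures quoted.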

4. Let us see if we can use these tables to help us to decide
whether .8 or a function of .8 should be used as the measure or
coefficient of correlation. An easy experiment is to add the .8
table and the .2 table together after increasing the former so
that the two tables represent the distribution of the same total
number of cases. The result of such a process is that the means
of the rows (or columns) will be half-way between those shown
in the two tables from which the composite table is formed.
These means are identical with those of the .5 table, although
the distribution of the cases is different. This is evidence that
we can assume that .8 or .5 or .2 is a proper coefficient of
correlation and that we need not speculate with functions of
these figures. If we generalise our result we may say that we
want to find a function of r such that

$\tfrac{1}{2}\{f(r_1) + f(r_2)\} = f\{\tfrac{1}{2}(r_1 + r_2)\}$

and this is satisfied by writing $f(r) = r$.
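The experiment of adding the two tables can be carried out exactly. The sketch below is illustrative; it assumes the same enumeration of the coin-tossing tables as a hypothetical helper `coin_table`, and uses exact fractions so that the comparison of means does not depend on rounding.

```python
from fractions import Fraction
from math import comb

def coin_table(common, total=10):
    """table[i][j] = cases with i heads at the first throw and j at the second."""
    free = total - common
    table = [[0] * (total + 1) for _ in range(total + 1)]
    for k in range(common + 1):
        for a in range(free + 1):
            for b in range(free + 1):
                table[k + a][k + b] += (comb(common, k) * comb(free, a)
                                        * comb(free, b))
    return table

def row_means(table):
    """Mean number of heads at the second throw for each first-throw count."""
    return [Fraction(sum(j * f for j, f in enumerate(row)), sum(row))
            for row in table]

t8, t5, t2 = coin_table(8), coin_table(5), coin_table(2)
# Increase the .8 table so that both tables contain the same number of cases.
scale = sum(map(sum, t2)) // sum(map(sum, t8))
mixed = [[scale * a + b for a, b in zip(row8, row2)]
         for row8, row2 in zip(t8, t2)]

means_mixed = row_means(mixed)
means_5 = row_means(t5)
print(scale, means_mixed == means_5)
print([float(m) for m in means_mixed])
```

The composite table is not the .5 table cell by cell, but its means rise by half a head for each additional head at the first throw, exactly as in the .5 table.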

5. It will, however, be noticed that we have chosen a
particular case where the distribution is based on a symmetrical
binomial and it does not follow that other cases will
be so easy to interpret. We can, however, form similar tables
with dice where we regard two of the six faces as "head" and
four other faces as "tail". We then get distributions of the
form $(\tfrac{1}{3} + \tfrac{2}{3})^{n}$; the means of the columns (or rows) are in a
straight line and we reach the means of the .5 table by adding
together equally large tables giving correlation of .8 and .2.
The r = .8 table is given on p. 168. Admittedly we have even
now only dealt with tables of double entry corresponding to
frequency distributions like the binomial series and we cannot
expect all the distributions that occur in practice to be so
simple. We must not assume that in every case the means will
follow a straight line nor are we entitled to say that the slope
of the straight line will give a correct measure of correlation
if the distribution diverges considerably from those discussed,
but the large majority of correlation tables conform approximately
to the type we have indicated.

6. The reader will have noticed that in the work we have
just been doing we have dealt with a series of points analogous
to the binomial series and not with a surface analogous to a
frequency-curve. The normal curve with which we dealt in a
previous chapter is, in certain conditions, the limit of the
binomial series and the frequency surface* corresponding with
the normal curve is

$$Z = \frac{N}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\,\exp\left\{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2}+\frac{y^2}{\sigma_2^2}-\frac{2xyr}{\sigma_1\sigma_2}\right)\right\}.$$

Now, if r = 0 this expression reduces to the product of two
normal curves and if we find $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} Zxy\,dx\,dy$, we reach
$Nr\sigma_1\sigma_2$, or $r = \dfrac{(xy)\ \text{moment}}{N\sigma_1\sigma_2}$, as we have already supposed in
Chapter VII.
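The statement about the product moment can be verified numerically. The sketch below is ours and is only an illustration: it assumes the form of the normal surface written above, and it replaces the double integral by a simple mid-point summation over a wide rectangle; the grid width and the number of steps are arbitrary choices.

```python
from math import exp, pi, sqrt


def normal_surface(x, y, n_total, s1, s2, r):
    """Ordinate Z of the normal frequency surface at the point (x, y)."""
    norm = n_total / (2 * pi * s1 * s2 * sqrt(1 - r * r))
    q = (x * x / s1**2 + y * y / s2**2 - 2 * r * x * y / (s1 * s2)) / (1 - r * r)
    return norm * exp(-0.5 * q)


def product_moment(n_total, s1, s2, r, half_width=8.0, steps=200):
    """Mid-point approximation to the double integral of Z*x*y."""
    dx = 2 * half_width * s1 / steps
    dy = 2 * half_width * s2 / steps
    total = 0.0
    for i in range(steps):
        x = -half_width * s1 + (i + 0.5) * dx
        for j in range(steps):
            y = -half_width * s2 + (j + 0.5) * dy
            total += normal_surface(x, y, n_total, s1, s2, r) * x * y
    return total * dx * dy


moment = product_moment(n_total=1000, s1=2.0, s2=3.0, r=0.5)
recovered_r = moment / (1000 * 2.0 * 3.0)
```

With N = 1000, standard deviations 2 and 3, and r = .5, the product moment should be close to 3000 and the recovered coefficient close to .5.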

7. The normal surface has some properties to which special
attention may be called. If we examine the distribution of an
array of type t, we see that it is


where h and g are written for the longer expressions in $\sigma_1$,
$\sigma_2$ and r.

Making the index a perfect square, we have

$$Z = Z_0'\,e^{-g\left(x-\frac{ht}{g}\right)^2},$$

which is a normal distribution having the same standard
deviation as that of the whole surface, but its mean differs
from that of the whole surface by $ht/g$. It follows that

(1) the deviation of the mean of the array is directly
proportional to the type, or, in other words, the means
of arrays increase or decrease in arithmetical progression
and so lie on a straight line,

(2) the standard deviations of all parallel arrays are equal
and independent of their types.


So far as the former of these conclusions is concerned, we
have the same property as that found in our coin-tossing tables
and assumed in the previous chapter. The other property is
not found with our coin-tossing tables. It must not be concluded
from this that the normal surface has so small a scope
as to be of little practical use; it has, probably, a far larger
scope than the analogous normal curve has in frequency-curve
work. It may help the reader to visualise the surface if he bears
in mind the following points:

(1) vertical sections cut parallel to the axis of x or the axis
of y are normal curves,

(2) the contour lines are ellipses and if these ellipses are
projected on to the plane Z = 0 they are concentric,
similar and similarly situated.

The appearance would be that of an isolated hill standing on a
wide plain. This plain rises very slowly as we approach the hill,
then the hillside becomes gradually steeper until, as we near
the top, it becomes less steep and the top is nearly, but not
quite, flat. The hill is narrowest when seen from the north-west
or south-east and widest when seen from the north-east
or south-west.

* See Appendix III c.

8. We may now discuss a danger against which we must be
on guard in statistical work on correlation. The danger is that
correlation may be revealed when it is absent, or exaggerated
when present, in consequence of the arrangement of the
statistical material. We will consider two causes of the introduction
of this "spurious correlation". The first may be taken
from our coin-tossing tables. We saw that by adding together
the .8 and .2 tables in equal proportions we reached a table
which gave a correlation of .5. But let us see what would
happen if we added together two tables where r = .5 but
shifted the mean of one of them. This might happen in practical
work if two persons, recording similar objects, measured
correctly except that one always overstated his results by a
constant figure. The results are then amalgamated and the
table formed might then be similar to the table on p. 169. The
coefficient of correlation is worked out and found to be .78.
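The effect of the displaced mean can be put in a line or two of arithmetic. The sketch below is ours and rests on assumptions not stated in the text: the two tables are taken to be equally large, to have the same variance in both variates, and the second observer is supposed to overstate both members of every pair by the same constant `shift`. Pooling then adds a quarter of the square of the shift to both the variance and the product moment.

```python
def pooled_correlation(r, variance, shift):
    """Correlation of two equal tables pooled after one is displaced.

    r        : correlation within each separate table
    variance : variance of either variate within each table
    shift    : constant by which one observer overstates both measurements
    """
    between = shift**2 / 4  # contribution of the displaced means
    return (r * variance + between) / (variance + between)


# Ten coins tossed in pairs with five in common: variance 2.5, r = .5.
for shift in (0, 1, 2, 3, 4):
    print(shift, round(pooled_correlation(0.5, 2.5, shift), 4))
```

The coefficient rises steadily with the size of the displacement although the correlation within each observer's table is unchanged at .5, which is the spurious effect described above.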

9. We may now consider how we might detect the cases in
which this sort of thing happens. The means of the various
rows run in a particular way; they begin and end as the r = .5
tables but, when the amalgamation comes in, the run of the
line is such as to join the two end pieces together with a curved
line. Again, the totals do not form a binomial or any single
frequency-curve. In the particular case these two points would
be sufficient warning, but in practice it is hard to apply them
because the ends of an experience, being based on relatively
small numbers, obscure the real shape of the regression lines
and the curve formed by the totals. There may also be many
observers instead of only two, and these observers might turn
the end pieces into curved lines and give a regression line like
a flattened S. The real remedy in such cases is to see that the
various experiences grouped together are alike as regards both
their means and their distributions and to use amalgamated
figures only when the amalgamation is justified.

10. Another way in which a spurious correlation may be
introduced arises through the use of indices. As an example we
may refer to endowment assurances by limited payments on
the books of a company doing a large quantity of such business
and consider the term of the original assurance ($x_1$), the number
of premiums to be paid in future ($x_2$) and the number of years
for which the policy has been in force ($x_3$). If we formed the
ratios $x_2/x_1$ and $x_3/x_1$ and worked out the coefficient of correlation,
we should not obtain a measure of the correlation between
number of premiums payable in future and the number of
years in force because the result of using fractions with the
same denominator in each would be to exaggerate correlation,
that is, to introduce spurious correlation.

The general propositions of spurious correlation, of which
the result just mentioned is a particular case, are as follows:

I. To find the mean of an index in terms of the means, standard
deviations and coefficient of correlation of the two absolute
measurements.

Let $x_1, x_2, x_3, x_4$ be the absolute sizes of any four correlated
subjects, $m_1, m_2, m_3, m_4$ their mean values, $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ their
standard deviations, $r_{12}, r_{13}, r_{14}, r_{23}, r_{24}, r_{34}$ the six coefficients
of correlation, $e_1, e_2, e_3, e_4$ the deviations of the four subjects
from their means, i.e. $x_1 = m_1 + e_1$, etc., $i_{13}$ the mean value of
the index $x_1/x_3$ and $i_{24}$ the mean value of $x_2/x_4$, $\Sigma_{13}$ and $\Sigma_{24}$ the
standard deviations of the indices $x_1/x_3$ and $x_2/x_4$ respectively,
and N the total number of groups.

We shall suppose the ratios of the deviations to the mean
values of the organs are so small that their cubes may be
neglected. Then



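$$i_{13} = \frac{1}{N}\,S\!\left(\frac{x_1}{x_3}\right) = \frac{1}{N}\,\frac{m_1}{m_3}\left\{N + \frac{S(e_1)}{m_1} - \frac{S(e_3)}{m_3} - \frac{S(e_1e_3)}{m_1m_3} + \frac{S(e_3^2)}{m_3^2}\right\}.$$

But $S(e_1) = S(e_3) = 0$ and $S(e_1e_3) = N\sigma_1\sigma_3r_{13}$ and $S(e_3^2) = N\sigma_3^2$; therefore

$$i_{13} = \frac{m_1}{m_3}\left(1 + \frac{\sigma_3^2}{m_3^2} - \frac{r_{13}\sigma_1\sigma_3}{m_1m_3}\right)$$

and

$$i_{24} = \frac{m_2}{m_4}\left(1 + \frac{\sigma_4^2}{m_4^2} - \frac{r_{24}\sigma_2\sigma_4}{m_2m_4}\right).$$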



II. To find the standard deviation of an index in terms of the
standard deviations and coefficient of correlation of the two
absolute measurements.


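$$N\Sigma_{13}^2 = S\!\left(\frac{x_1}{x_3} - i_{13}\right)^2$$

or

$$\Sigma_{13}^2 = \frac{i_{13}^2}{N}\,S\!\left(\frac{e_1}{m_1} - \frac{e_3}{m_3} + \text{square terms}\right)^2 = i_{13}^2\left(\frac{\sigma_1^2}{m_1^2} + \frac{\sigma_3^2}{m_3^2} - \frac{2r_{13}\sigma_1\sigma_3}{m_1m_3}\right),$$

$$\Sigma_{13} = i_{13}\sqrt{\frac{\sigma_1^2}{m_1^2} + \frac{\sigma_3^2}{m_3^2} - \frac{2r_{13}\sigma_1\sigma_3}{m_1m_3}}.$$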



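III. To find the coefficient of correlation of two indices in terms
of the coefficients of correlation of four absolute measurements and
their standard deviations.

Let $x_1/x_3$ and $x_2/x_4$ be the two indices.

Then, if $\rho$ be the coefficient of correlation of the two indices,

$$\rho = \frac{r_{12}\dfrac{\sigma_1\sigma_2}{m_1m_2} - r_{14}\dfrac{\sigma_1\sigma_4}{m_1m_4} - r_{23}\dfrac{\sigma_2\sigma_3}{m_2m_3} + r_{34}\dfrac{\sigma_3\sigma_4}{m_3m_4}}{\sqrt{\dfrac{\sigma_1^2}{m_1^2} + \dfrac{\sigma_3^2}{m_3^2} - \dfrac{2r_{13}\sigma_1\sigma_3}{m_1m_3}}\ \sqrt{\dfrac{\sigma_2^2}{m_2^2} + \dfrac{\sigma_4^2}{m_4^2} - \dfrac{2r_{24}\sigma_2\sigma_4}{m_2m_4}}}.$$

Proposition I shows that the mean of an index is not the
ratio of the means of the corresponding absolute measurements,
and Proposition III shows that $\rho$ will vanish when the four
subjects forming the indices are quite uncorrelated, while, if
two, say, the third and fourth, are identical, so that $r_{34} = 1$ and
$\sigma_3/m_3 = \sigma_4/m_4$, we have

$$\rho = \frac{r_{12}\dfrac{\sigma_1\sigma_2}{m_1m_2} - r_{13}\dfrac{\sigma_1\sigma_3}{m_1m_3} - r_{23}\dfrac{\sigma_2\sigma_3}{m_2m_3} + \dfrac{\sigma_3^2}{m_3^2}}{\sqrt{\dfrac{\sigma_1^2}{m_1^2} + \dfrac{\sigma_3^2}{m_3^2} - \dfrac{2r_{13}\sigma_1\sigma_3}{m_1m_3}}\ \sqrt{\dfrac{\sigma_2^2}{m_2^2} + \dfrac{\sigma_3^2}{m_3^2} - \dfrac{2r_{23}\sigma_2\sigma_3}{m_2m_3}}}.$$

This would become applicable in the case of endowment
assurances by limited payments to which we referred.

An interesting special case arises when the subjects $x_1$, $x_2$, $x_3$
are not correlated and $x_1/x_3$ and $x_2/x_3$ are formed; then

$$\rho = \frac{\dfrac{\sigma_3^2}{m_3^2}}{\sqrt{\dfrac{\sigma_1^2}{m_1^2} + \dfrac{\sigma_3^2}{m_3^2}}\ \sqrt{\dfrac{\sigma_2^2}{m_2^2} + \dfrac{\sigma_3^2}{m_3^2}}}.$$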



11. The practical lessons about spurious correlation to be
learnt from the foregoing are (1) to deal with homogeneous
data and not to be too certain about the value of a coefficient
in the case of amalgamated experiences until you are sure that
those experiences are homogeneous, (2) to avoid making, or be
careful in interpreting, correlation tables where the functions
correlated are expressed as indices in which the denominators
are identical or may themselves be correlated.
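The second lesson can be illustrated with the first-order formula usually given for Proposition III (it is due to Pearson). The sketch below is ours: the function name and the shorthand $v_i = \sigma_i/m_i$ are introduced only for the illustration, and the formula holds under the assumption made above that cubes of the ratios of deviations to means may be neglected.

```python
from math import sqrt


def index_correlation(v, r):
    """First-order correlation between the indices x1/x3 and x2/x4.

    v : dict {1..4} of ratios sigma_i / m_i (our shorthand)
    r : dict {(i, j)} of correlations between the absolute measurements, i < j
    """
    numerator = (r[1, 2] * v[1] * v[2] - r[1, 4] * v[1] * v[4]
                 - r[2, 3] * v[2] * v[3] + r[3, 4] * v[3] * v[4])
    spread_13 = sqrt(v[1] ** 2 + v[3] ** 2 - 2 * r[1, 3] * v[1] * v[3])
    spread_24 = sqrt(v[2] ** 2 + v[4] ** 2 - 2 * r[2, 4] * v[2] * v[4])
    return numerator / (spread_13 * spread_24)


v = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1}

# Four quite uncorrelated subjects: the indices are uncorrelated too.
none = {(1, 2): 0, (1, 3): 0, (1, 4): 0, (2, 3): 0, (2, 4): 0, (3, 4): 0}
rho_four = index_correlation(v, none)

# Identical denominators (third and fourth subjects the same), nothing else
# correlated: a spurious correlation appears between the two indices.
same_denominator = dict(none)
same_denominator[3, 4] = 1
rho_common = index_correlation(v, same_denominator)
```

With equal ratios $\sigma/m$ for all three subjects the spurious correlation between the two indices is .5, although the absolute measurements are quite uncorrelated.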

We may add that spurious correlation may arise when the
correlated pairs relate to successive years, and so are not taken
at random as regards time. If, however, the correlation between
the two nth differences becomes equal to the correlation
between the two (n + 1)th differences, we reach the correlation
independent of time, provided the dependence of each variable
on time takes the form $a + bt + ct^2 + \dots$
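The use of differences may be illustrated by an artificial example. The sketch below is ours; the two series, their quadratic dependence on time, the underlying correlation of .6 and the seed of the random-number generator are all assumptions made for the illustration and are not taken from any actual statistics.

```python
import random


def pearson(xs, ys):
    """Product-moment coefficient of correlation of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def difference(series, order):
    """Differences of the given order of a series."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


random.seed(1)
rho = 0.6  # correlation of the parts that do not depend on time
xs, ys = [], []
for t in range(2000):
    u = random.gauss(0, 1)
    w = rho * u + (1 - rho**2) ** 0.5 * random.gauss(0, 1)
    xs.append(1 + 0.5 * t + 0.01 * t * t + u)   # a + bt + ct^2 + fluctuation
    ys.append(2 + 0.3 * t + 0.02 * t * t + w)

raw = pearson(xs, ys)
by_order = [pearson(difference(xs, k), difference(ys, k)) for k in range(1, 5)]
```

The raw coefficient is nearly unity because both series grow with time; once the second differences are reached the coefficient settles near the assumed .6 and the third differences give much the same value.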


[Table: Coin-tossings with ten coins in pairs. Eight coins common to each member of pair.]

[Table: Coin-tossings with ten coins in pairs. Two coins common to each member of pair. Headings: number of heads in second tossing; total; mean of row; mean of column.]




CHAPTER IX 


CORRELATION OF CHARACTERS NOT 
QUANTITATIVELY MEASURABLE 

1. Before the theory in this section is discussed we will give
a table showing the class of problem with which it deals,
drawn from vaccination statistics and relating to the Sheffield
smallpox outbreak of 1887-8.*


Degree of effective        Strength to resist smallpox when incurred
vaccination
(cicatrix)                 Recoveries      Deaths       Total

Present                       3,951          200        4,151
Absent                          278          274          552

Total                         4,229          474        4,703


The characters with which we are concerned are "Strength
to resist smallpox when incurred" and "Degree of effective
vaccination", and the statistics cannot be arranged in a more
detailed manner. The characters cannot be measured quantitatively,
but as the absence of such measurement does not
mean that there is no correlation, we must see how the coefficient
can be obtained in such a case.

* Biometrika, i, 375 et seq. This paper, by W. R. Macdonell, and a supplementary
one deal with the subject in a way that shows clearly the strength of
the evidence on the side of vaccination. The question of class is investigated,
a practical point frequently neglected.

2. Let us consider this problem in the first place by seeing if
we can write down a few cases in which we can assign a value
to the coefficient of correlation from general considerations. If
we toss a coin, it must come down "head" or "tail"; if we form
pairs as in Chapter VIII, by pairing consecutive tossings, there
will be no correlation and in a table such as that of the previous
paragraph there would be an equal number in each division.
But if we made a pair by leaving the coin on the table, and
counting it a second time, we should have absolute correlation
and we should have in our table an equal number in the top
left-hand and bottom right-hand divisions, the other two
divisions being blank. If we amalgamate these two tables,
assuming that the total of each is 4, we reach the table shown
below, having a coefficient of correlation of .5.


Second               First tossing
tossing            Head        Tail        Total

Head                 3           1           4
Tail                 1           3           4

Total                4           4           8


In these simple cases, where the four divisions represent the
frequencies at four separate points, the correct value of r is
given by the expression

$$\frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}$$

where a, b, c and d have the meanings indicated in the scheme
of § 5.
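The expression is easily checked on the three coin tables just described. The following short sketch is ours; the function name is introduced for the illustration only, and a, b, c, d are read across the rows of the table.

```python
from math import sqrt


def four_point_r(a, b, c, d):
    """Coefficient of correlation when the four divisions are four separate points."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


no_correlation = four_point_r(1, 1, 1, 1)        # consecutive tossings paired
absolute_correlation = four_point_r(2, 0, 0, 2)  # the same coin counted twice
amalgamated = four_point_r(3, 1, 1, 3)           # the two tables added together
print(no_correlation, absolute_correlation, amalgamated)
```

The three tables give 0, 1 and .5 respectively, agreeing with the values assigned from general considerations.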

3. It does not, however, follow that this expression is one
which may be used in all circumstances as, though there are
only four divisions, the things measured may imply a continuous
scale of measurement even though we cannot or do not
express it in detailed fashion. Thus, in our vaccination statistics,
the degree of successful vaccination may vary between a
vaccination in infancy, for a person aged 40 at the time of the
epidemic, and a series of vaccinations, the last of which has
been recently performed. Again, the power of recovery when
attacked may also be deemed to lie on a longer scale than that
implied by the two divisions "recoveries" and "deaths". To
take another example we could, if we were studying eye-colour
in parent and offspring, make a scale of colour from
black down to pale blue (or to absence of pigment in albinotic
cases), but the statistics might be available merely in the form
"brown" and "not brown". In other words the statistics in
a four-fold correlation table may relate to a continuous
frequency distribution like the table on p. 142, but owing to
the way the facts had to be stated or collected there are only
four divisions for the whole of the material. Our first problem
is to see whether the simple formula at the end of § 2 will give
a satisfactory answer in these circumstances and, if it fails, what
alternative may be adopted.

4. In Chapter VIII we gave tables based on coin-tossing and
we might group the material of one of these tables into four
divisions and see what answer the formula in question gives.
If we take the table having a correlation of .5 and cut it between
the 5 "heads" and 6 "heads", we reach the following:


Number of heads        Number of heads in first tossing
in second
tossing                  0-5          6-10         Total

0-5                     15,330        5,086        20,416
6-10                     5,086        7,266        12,352

Total                   20,416       12,352        32,768


The formula gives

$$\frac{15330 \times 7266 - 5086 \times 5086}{20416 \times 12352} = .34$$

which is far removed from the true value of .5.

5. It is clear from this evidence that we must look for
another solution and having seen in the previous chapter that
we could express a frequency surface as

$$Z = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2}\,e^{-\frac{1}{2}\,\frac{1}{1-r^2}\left(\frac{x^2}{\sigma_1^2}+\frac{y^2}{\sigma_2^2}-\frac{2rxy}{\sigma_1\sigma_2}\right)}$$

we may now consider what conclusions we may draw if we
divide this surface into four parts by two planes at right angles
to the axes of x and y at distances h' and k' from the origin, as
suggested by the figures on p. 173.





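Then

$$d = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2}\int_{h'}^{\infty}\!\int_{k'}^{\infty} e^{-\frac{1}{2}\,\frac{1}{1-r^2}\left(\frac{x^2}{\sigma_1^2}+\frac{y^2}{\sigma_2^2}-\frac{2rxy}{\sigma_1\sigma_2}\right)}\,dx\,dy$$

$$\phantom{d} = \frac{N}{2\pi\sqrt{1-r^2}}\int_{h}^{\infty}\!\int_{k}^{\infty} e^{-\frac{1}{2}\,\frac{1}{1-r^2}\left(x^2+y^2-2rxy\right)}\,dx\,dy$$

by substituting $x^2$ for $x^2/\sigma_1^2$ and $y^2$ for $y^2/\sigma_2^2$
and writing $h = h'/\sigma_1$ and $k = k'/\sigma_2$.

Further

$$b + d = \frac{N}{\sqrt{2\pi}\,\sigma_1}\int_{h'}^{\infty} e^{-\frac{1}{2}x^2/\sigma_1^2}\,dx = \frac{N}{\sqrt{2\pi}}\int_{h}^{\infty} e^{-\frac{1}{2}x^2}\,dx$$

and

$$c + d = \frac{N}{\sqrt{2\pi}}\int_{k}^{\infty} e^{-\frac{1}{2}y^2}\,dy$$

and, remembering that N, the total frequency, = a + b + c + d,
we have

$$N - 2(b+d) = N - N\sqrt{\frac{2}{\pi}}\int_{h}^{\infty} e^{-\frac{1}{2}x^2}\,dx,$$

$$\frac{(a+c)-(b+d)}{N} = \sqrt{\frac{2}{\pi}}\int_{0}^{h} e^{-\frac{1}{2}x^2}\,dx$$

and, similarly,

$$\frac{(a+b)-(c+d)}{N} = \sqrt{\frac{2}{\pi}}\int_{0}^{k} e^{-\frac{1}{2}y^2}\,dy.$$

As a, b, c, and d are known, h and k can be found from
Sheppard's Tables, and the problem becomes

"To find a value for r from the equation

$$\frac{d}{N} = \frac{1}{2\pi\sqrt{1-r^2}}\int_{h}^{\infty}\!\int_{k}^{\infty} e^{-\frac{1}{2}\,\frac{1}{1-r^2}\left(x^2+y^2-2rxy\right)}\,dx\,dy$$

where d, N, h, and k are known."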





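The solution (see Appendix IV) leads to the following
equation:

$$\begin{aligned}
\frac{ad-bc}{N^2HK} = r &+ \frac{r^2}{2!}\,hk + \frac{r^3}{3!}\,(h^2-1)(k^2-1) + \frac{r^4}{4!}\,h(h^2-3)\,k(k^2-3)\\
&+ \frac{r^5}{5!}\,(h^4-6h^2+3)(k^4-6k^2+3)\\
&+ \frac{r^6}{6!}\,h(h^4-10h^2+15)\,k(k^4-10k^2+15)\\
&+ \frac{r^7}{7!}\,(h^6-15h^4+45h^2-15)(k^6-15k^4+45k^2-15) + \text{etc.}
\end{aligned}$$

where

$$H = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}h^2} \quad\text{and}\quad K = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}k^2}.$$

The numerical solution has to be obtained by approximating
to the roots, and Newton's method* is convenient for the
purpose.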

6. The numerical work of our first example is as follows:

$$\frac{(a+c)-(b+d)}{N} = \frac{3755}{4703} = .7984265$$

$$h = 1.27716$$

by interpolation in Sheppard's Tables (see Tables for Statisticians).
In using these tables for this purpose, remember that
the value .7984265 corresponds to $\alpha$ in his notation, so

$$\tfrac{1}{2}(1 + .7984265) = .8992132$$

must be looked up inversely in his Table II. If his Table III be
used, it must be entered with .7984265.

* Newton's method of approximating to the root of an equation. Let f(x) = 0
be an equation from which the value of x is to be found and let b be a value
near to x so that x = b + h where h is small; then f(x) = f(b + h) = f(b) + hf'(b) +
terms involving higher powers of h by Taylor's Theorem, and since f(x) = 0, we
have $h = -\dfrac{f(b)}{f'(b)}$ or $x = b - \dfrac{f(b)}{f'(b)}$. The chief objection to the method is that there
may be more than one root near the value b, but this does not hold in the application
to correlation. (Cf. Approximations to rate of interest from an annuity,
Todhunter's Interest and Annuities Certain, p. 177, formula 2.)



Similarly

$$\frac{(a+b)-(c+d)}{N} = .7652561$$

$$k = 1.18833$$

We next require $\dfrac{ad-bc}{N^2HK}$, and we first get from Sheppard's
Tables

H = .1764870 ∴ log H = $\bar{1}$.2467127

K = .1969111, log K = $\bar{1}$.2942702

Hence

$$\frac{ad-bc}{N^2HK} = 1.336062.$$

Dr Macdonell gives 56 instead of 62 as the last two figures;
the difference is probably due to interpolation.

Turning to the expression for r, we notice that hk is a product
in the coefficients of $r^2$, $r^4$, etc., so it is well to work out its
value and keep a note of it while the coefficients are being
found. It is also advisable to begin the work by writing down
the first six or seven powers of h and k.
Macdonell gives the following series:

$$.097083r^7 + .008170r^6 + .119614r^5 + .137450r^4 + .043352r^3 + .758844r^2 + r = 1.336056$$


In order to obtain r we must find a value near the true one
as a first approximation.

Taking

$$.758844r^2 + r - 1.336056 = 0$$

we have

$$r = \frac{-1 + \sqrt{1 + 4 \times 1.336 \times .7588}}{1.5177} = .8$$

Now, this value will be in excess of the truth because we have
used only two terms of the series on the left-hand side of the
equation for finding r, and we may take .77 as a trial rate.
Applying Newton's Rule, we have:

$$r = .77 - \frac{-1.336056 + (.77) + .7588(.77)^2 + .0434(.77)^3 + .1375(.77)^4 + .1196(.77)^5 + .0082(.77)^6 + .0971(.77)^7}{1 + 2(.77)(.7588) + 3(.77)^2(.0434) + 4(.77)^3(.1375) + 5(.77)^4(.1196) + 6(.77)^5(.0082) + 7(.77)^6(.0971)}$$

$$= .77 - \frac{.0022}{2.861} = .7692$$

In work such as this a table giving the first seven powers of
the natural numbers is a help.
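The whole of the arithmetic of § 6 can be retraced mechanically. The sketch below is ours and is offered only as a check on the working: it takes h and k from the inverse of the normal probability integral in Python's standard library instead of Sheppard's Tables, stops the series at the seventh power of r as Macdonell does, and starts Newton's method from an arbitrary first value of .5. The names `hermite` and `tetrachoric_r` are introduced for the illustration.

```python
from math import factorial
from statistics import NormalDist


def hermite(x, count):
    """Values 1, x, x^2 - 1, x(x^2 - 3), ... used in the series for r."""
    values = [1.0, x]
    for n in range(1, count - 1):
        values.append(x * values[n] - n * values[n - 1])
    return values[:count]


def tetrachoric_r(a, b, c, d, terms=7, start=0.5, tolerance=1e-10):
    """Return h, k, the coefficients of r, r^2, ... and the root r."""
    n_total = a + b + c + d
    std = NormalDist()
    h = std.inv_cdf((a + c) / n_total)   # (a + c)/N = (1 + alpha)/2
    k = std.inv_cdf((a + b) / n_total)
    target = (a * d - b * c) / (n_total**2 * std.pdf(h) * std.pdf(k))
    in_h, in_k = hermite(h, terms), hermite(k, terms)
    coefficients = [in_h[i] * in_k[i] / factorial(i + 1) for i in range(terms)]
    r = start
    for _ in range(50):  # Newton's method on the truncated series
        f = sum(cf * r ** (i + 1) for i, cf in enumerate(coefficients)) - target
        slope = sum((i + 1) * cf * r**i for i, cf in enumerate(coefficients))
        step = f / slope
        r -= step
        if abs(step) < tolerance:
            break
    return h, k, coefficients, r


# Vaccination table: a = 3951, b = 200, c = 278, d = 274.
h, k, coefficients, r = tetrachoric_r(3951, 200, 278, 274)

# Coin-tossing table of § 4 cut between 5 and 6 heads.
_, _, _, r_coins = tetrachoric_r(15330, 5086, 5086, 7266)
```

For the vaccination statistics this gives h and k agreeing with the values above and a coefficient of about .769; for the grouped coin-tossing table of § 4 it gives a value a little above .51, against .34 from the four-point formula.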

7. Tables of various functions required for the arithmetical
work will be found in Tables for Statisticians. The term
"tetrachoric functions"* is employed there. These tables are
arranged so that we can use the equation

$$d/N = \tau_0\tau_0' + r\,\tau_1\tau_1' + r^2\,\tau_2\tau_2' + \text{etc.}$$

where the values of $\tau$ are tabulated up to …, and further values
can be obtained by a difference formula given in the introduction
to the tables.

The calculation of the coefficient has been set out above in
detail, but with the help of Tables VIII and IX of Tables for
Statisticians, Part II, much of the work can be avoided. All
that has to be done, if these tables are available, is (1) to
calculate h and k as shown in § 6, (2) to calculate the ratio that
the number in quadrant d bears to the total number of cases,
i.e. d/N, and (3) to interpolate in the tables so as to obtain r.

* The tetrachoric functions are closely allied to the Hermite polynomials and
provide the fullest tables available. The sth tetrachoric function is (cp. p. 130)

$$\tau_s(x) = \frac{(-1)^{s-1}}{\sqrt{s!}\,\sqrt{2\pi}}\,\frac{d^{s-1}}{dx^{s-1}}\,e^{-\frac{1}{2}x^2}$$

and the (s - 1)th Hermite polynomial is

$$H_{s-1}(x) = (-1)^{s-1}\,e^{\frac{1}{2}x^2}\,\frac{d^{s-1}}{dx^{s-1}}\,e^{-\frac{1}{2}x^2}.$$

8. We may now return to our coin-tossing, and we find that
if we work out the coefficient of correlation for the table in § 4
by the method just discussed, we reach a value of between .51
and .52. This is a good result, especially when we remember that
the coin-tossing is not an absolutely continuous scale.

The broad conclusion that may be reached is that the assumptions
lead to reasonable results in the kind of cases we have
tested. The method does not work so well when the frequency
surface is cut far from the mean and the numerical results in
such cases should not be assumed to have minute accuracy.
9. We have assumed that the data available are only divided
into four divisions and we shall postpone till later (Chapter
XII) the discussion of correlation when the characteristics are
not quantitatively measurable but are divided into several
categories. We may, however, now deal with the case in which
one variate is and the other is not quantitatively measurable
as, for instance, in the table on p. 179 relating to the effect of
enlarged glands on the weight of children (boys).* Though the
statistics are divided into good and bad glands, the condition
of glands is a continuous variate: some of the boys with bad
glands were worse than others.

If the reader considers a volume of frequency built out of a
complete table such as that for endowment assurances, or out
of a correlation table giving relative ages of husbands and
wives, he will see that he has a complete distribution. Now, if
a volume of frequency be cut off from such a complete volume
by a vertical plane at a given value of one variate, then the
vertical through the centroid of this volume cuts the regression
line. The vertical plane in the two-row table is at the division
of the rows, in our example where the good glands end and the
bad glands begin. If p and q be the co-ordinates of the point of
section where the vertical through the centroid of the volume
cuts the regression line, then we have, $\sigma_1$ and $\sigma_2$ being the
standard deviations of the two variates and r the correlation,

$$\frac{p}{\sigma_1} = r\,\frac{q}{\sigma_2} \quad\text{or}\quad r = \frac{p/\sigma_1}{q/\sigma_2}.$$

* The method is given by K. Pearson in Biometrika, vii, 96 et seq., and the
example is taken from that paper.



Weight      Boys with       Boys with       Total
            good glands     bad glands

  14             2               -              2
  16             3               5              8
  18            15              26             41
  20            20              40             60
  22            28              47             75
  24            34              30             64
  26            30              31             61
  28            29              20             49
  30            30              30             60
  32            21              14             35
  34            18              11             29
  36            18               5             23
  38             6               7             13
  40             5               2              7
  42             7               3             10
  44             1               -              1
  46             -               -              -
  48             3               -              3
  50             1               -              1
  52             1               -              1
  62             2               -              2

Total          274             271            545


Now p is the mean value of the quantitatively measurable
variate for all the pairs with a certain one of the alternative
variates, in our example, the mean weight of boys with bad
glands, and $\sigma_1$ is the standard deviation of all the boys. We
cannot calculate q and $\sigma_2$ in a similar way, because they relate
to glands of which no quantitative measure is available. If we
assume the non-measurable variate (glands) to follow the
normal probability distribution, the proportion of the non-measurable
variate gives, with the help of tables of the
probability integral, the ratio $y/\sigma_2$ for the distance from the
mean at which the division of this variate occurs, and then

$$\frac{q}{\sigma_2} = \frac{\dfrac{N}{\sqrt{2\pi}\,\sigma_2}\displaystyle\int_{y}^{\infty}\frac{y}{\sigma_2}\,e^{-\frac{1}{2}y^2/\sigma_2^2}\,dy}{\dfrac{N}{\sqrt{2\pi}\,\sigma_2}\displaystyle\int_{y}^{\infty}e^{-\frac{1}{2}y^2/\sigma_2^2}\,dy} = \frac{\dfrac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}y^2/\sigma_2^2}}{\dfrac{1}{\sqrt{2\pi}}\displaystyle\int_{y/\sigma_2}^{\infty}e^{-\frac{1}{2}y^2}\,dy}.$$

The numerator is z and the denominator $\tfrac{1}{2}(1 - \alpha)$ in the
notation used in Sheppard's tables of the probability integral.




The working of the numerical example may help to make the
method clearer; it is as follows:

The mean weight of all the boys is . . . . . . . 27.7522

The standard deviation is . . . . . . . . . . . 6.7502

The mean weight of boys with bad glands is . . . 27.3737

$\tfrac{1}{2}(1 - \alpha)$ = 271/545 = .4972
$\tfrac{1}{2}(1 + \alpha)$ = .5028 and this value corresponds with
z = .3989 in Sheppard's tables.

The correlation of glands and weight is

$$\frac{27.7522 - 27.3737}{6.7502} \Big/ \frac{.3989}{.4972} = .070$$

It may be remarked that the use of $\tfrac{1}{2}(1 - \alpha)$ assumes that
the column with the smaller total frequency will be taken; thus,
in our example, there are fewer boys with bad glands than with
good glands.
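The same working can be set out as a short calculation. The sketch below is ours; it starts from the means, standard deviation and proportion quoted above, not from the table, and it takes z from the normal curve in Python's standard library instead of Sheppard's tables. The function name and its arguments are introduced for the illustration.

```python
from statistics import NormalDist


def two_row_r(mean_all, mean_group, sd_all, group_share):
    """Correlation when one variate is only divided into two groups.

    group_share is the proportion of cases in the group whose mean is used,
    i.e. (1 - alpha)/2 in Sheppard's notation.
    """
    std = NormalDist()
    cut = std.inv_cdf(1 - group_share)  # point of division in units of sigma_2
    z = std.pdf(cut)                    # ordinate of the normal curve there
    return ((mean_all - mean_group) / sd_all) / (z / group_share)


r_glands = two_row_r(27.7522, 27.3737, 6.7502, 271 / 545)
print(round(r_glands, 3))
```

This reproduces the coefficient of .070 for glands and weight.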

10. This example suggests a practical point, namely, that,
before actually working out a coefficient of correlation, it is
advisable to look at the statistics and form a preliminary idea
of whether there is any correlation. In tables such as that on
p. 142 there is no correlation if all the means of the rows are
alike and all the means of the columns are alike. Similarly,
in tables such as the one on p. 170, if the entries within the
table are proportional to the totals there is no correlation. In
the example in § 9 above a comparison of the two inner columns
with the total column shows that if there is any correlation it
must be small because the distribution in the total would give
a possible "graduation" of each of the inner columns.



CHAPTER X 


STANDARD ERRORS

1. In statistical work we calculate a mean from a number of
measurements, and we may be tempted to think that our work
has definitely established the mean with which we are concerned.
The arithmetical work may be correct in every detail
and the measurements may have been made accurately, but
the mean found from the statistics may differ from the true
mean of the character measured because the things measured
are limited in number, because, in other words, the sample we
have taken does not exactly represent the unlimited population
from which it is drawn. If we toss five coins and record
the number of heads, we should obtain a table like the following
where we give results of 140 repetitions of the experiment.*


Number of "heads"      Number of trials in which the number of
in trial               heads in previous column was recorded

       5                            4
       4                           24
       3                           49
       2                           40
       1                           20
       0                            3

                                  140


Now, treating this as a mere statistical problem, in which we
do not know a priori anything about the true distribution, we
may work out the mean number of "heads" as 2.6. This does
not prove that this mean and no other can arise; a second
experiment might give a different result and we cannot, therefore,
say from our calculation what the mean value really is.
We can, however, approach the problem in another way: we
can try to decide how deviations from a true mean are likely to
be distributed and so form an opinion as to how a mean calculated
from an experiment or series of experiments will differ
from the truth. For practical purposes we might rest content if
we could say that the true mean will not differ from the calculated
mean by more than a small quantity (e) once in a hundred
trials. Before we go into the measures actually used, let us
consider some of the points in a preliminary way.

* Such an experiment may, alternatively, be regarded as drawing a random
sample of 140 cases from an infinite population distributed as $(\tfrac{1}{2} + \tfrac{1}{2})^5$.
All our statistical experience makes us feel sure that an
experiment based on 1,000 cases must be more reliable than a
similar experiment based on 50 cases, so we anticipate that e,
the small difference between the true and the calculated mean,
will depend in some way on the number of cases. Again, a
distribution that spreads widely gives the mean more opportunity
to deviate than a distribution that is concentrated;
so that we may also anticipate that e will depend on the spread
of the distribution, that is, on its standard deviation.

2. These remarks apply to all statistical measures. The
measures are inexact and only approximate to the truth, but
we can say that it is highly probable that they do not differ by
more than a certain amount from the result which would be
obtained if we could deal with an unlimited number of facts.
In our discussion we have spoken of means, but every other
measure is subject to the same general considerations and we
must, therefore, consider what sort of value may be assigned
to the small error e for means, standard deviations, coefficients
of correlation and other measures. We have anticipated that
this error will depend on a standard deviation, and this sometimes
leads to a little confusion because we must make up our
minds as to the distribution to which the standard deviation
refers. Let us imagine that we have worked out a coefficient
of correlation for 100 pairs of, say, ages at marriage of husband
and wife. Then we work out a second coefficient of correlation
for another hundred pairs and go on till we have a large number
of these results. The coefficients will fall into a distribution like
one of the frequency distributions we discussed in earlier
chapters and it is the standard deviation of that distribution
from its mean with which we are concerned. Similarly, we can
repeat the coin-tossing experiment of § 1 over and over again
and calculate the mean from each experiment. We may obtain
any value for the mean between 5 and 0 heads; these extremes
will only arise in the most unlikely case when every trial gave
5 heads or every trial gave no head. The most likely mean is
2.5 and if we repeat the experiment sufficiently we shall form
an idea of the way the means are distributed: we shall reach a
frequency distribution of means having its own mean, standard
deviation, etc., and we must not confuse it with the frequency
distributions such as that in the table in § 1 from which the
means were calculated. It will help to avoid confusion of ideas
if we speak of "standard error" when we are referring to the
frequency distribution of a statistical measure (such as a mean,
coefficient of correlation, etc.) instead of speaking of "standard
deviation". The standard error is, then, the standard deviation
of the frequency distribution of the particular measure we are
examining.
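For the coin-tossing experiment of § 1 the distribution of the mean need not be found by repeated trial, since it can be written down exactly. The sketch below is ours: it assumes fair coins, so that one experiment of 140 trials of five coins is equivalent to 700 independent tossings, and it compares the result with the familiar rule that the standard error of a mean is the standard deviation of a single trial divided by the square root of the number of trials, a rule not yet established at this point of the text.

```python
from math import comb, sqrt

COINS = 5      # coins tossed in each trial
TRIALS = 140   # trials making up one experiment

# The table of § 1: number of heads -> number of trials.
observed = {5: 4, 4: 24, 3: 49, 2: 40, 1: 20, 0: 3}
observed_mean = sum(h * f for h, f in observed.items()) / sum(observed.values())

# Exact distribution of the total number of heads in one experiment,
# and hence of the mean number of heads per trial.
tosses = COINS * TRIALS
pmf = [comb(tosses, k) / 2**tosses for k in range(tosses + 1)]
means = [k / TRIALS for k in range(tosses + 1)]

mean_of_means = sum(p * m for p, m in zip(pmf, means))
standard_error = sqrt(sum(p * (m - mean_of_means) ** 2 for p, m in zip(pmf, means)))

# Proportion of experiments whose mean lies within twice the standard error.
within_two = sum(p for p, m in zip(pmf, means)
                 if abs(m - mean_of_means) <= 2 * standard_error)

print(round(observed_mean, 4), round(standard_error, 4), round(within_two, 4))
```

The standard error of the mean is here about .094, much smaller than the standard deviation of about 1.12 of the table from which a mean is calculated, and the observed mean differs from 2.5 by roughly one standard error.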

3. With this introduction we may now consider the simplest
of the bell-shaped frequency curves, namely, the normal curve
of error, and see what conclusions we may draw if the distribution
of a statistical measure takes that form. It has, in
fact, been shown to be the form that the distribution of
statistical measures tends to assume when the number of cases
in the sample is large. Thus, even if the distribution in § 1 had
been skew instead of symmetrical, the distribution of the
means would have been more nearly of the form of the normal
curve than the skew distributions from which they were
obtained. By reference to the tables of this curve we see that
the area corresponding to the standard deviation is about two-thirds
of the whole area, while the area corresponding to twice
the standard deviation is .9545 of the whole area.

In other words, if the distribution takes this form we can say
that an error of more than twice the standard error will occur
9 times in 200 trials and is, therefore, unlikely to have arisen in
the particular case with which we are dealing. The diagram will
help the reader to follow this argument. The area between AB,
which is at a distance equal to the standard deviation from the
mean one side, and A'B', at the same distance the other side of
the mean, is approximately two-thirds of the whole area. The
lines CD and C'D', which are twice as far from the mean,
include nearly the whole curve; the pieces beyond those lines
are tails which must be of relatively small dimensions.

4. It was formerly the custom to use another function
known as the probable error, which is .67449 times the standard
error. The probable error gives that value of x (say p) which
divides the part of the normal curve representing positive
errors into two equal portions; it is therefore given by

$$\frac{1}{\sigma\sqrt{2\pi}}\int_{0}^{p} e^{-\frac{1}{2}x^2/\sigma^2}\,dx = \tfrac{1}{4}$$

where the whole area of the curve (positive plus negative
deviations) is unity. In order to find p in terms of the standard
deviation, we have, therefore, to obtain the value of x corresponding
to $\tfrac{1}{2}(1 + \alpha) = .75$ in Tables for Statisticians, Part I,
Table II or short table in Appendix IX, where $\alpha$ is

$$\alpha = \sqrt{\frac{2}{\pi}}\int_{0}^{x} e^{-\frac{1}{2}x^2}\,dx.$$

This can be done by interpolating inversely and p is thus found
to be .67449 approximately. The mean, or rather the vertical
through the mean, divides the whole distribution into two
equal parts; the probable error divides it into fourths and gives
what Galton called the quartiles. The position is shown with
the letters P and P' in the diagram; thrice the probable error
includes about the same area as twice the standard error.

We may set down the following general rules:

(1) the true value and a calculated value of a mean or other
characteristic are unlikely to differ by more than twice
the standard error;

(2) if an experiment on any subject leads to a result which
differs from that expected by more than twice the
standard error we must suspect that we are not dealing
with a random sample.
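
The areas and the probable error quoted in this section can be checked numerically. The following is a minimal sketch using Python's standard library in place of the printed tables; the function name `central_area` is an illustrative choice and not part of the text.

```python
from statistics import NormalDist

std_normal = NormalDist()  # normal curve with mean 0 and standard deviation 1


def central_area(k: float) -> float:
    """Area of the normal curve lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# The probable error cuts off one quarter of the whole area between the mean
# and itself, so it is the deviation at which the cumulative area is .75.
probable_error = std_normal.inv_cdf(0.75)

print(round(central_area(1), 4))                # about two-thirds of the area
print(round(central_area(2), 4))                # area within twice the standard error
print(round(200 * (1 - central_area(2)), 1))    # expected excesses in 200 trials
print(round(probable_error, 5))                 # probable error in units of sigma
print(round(central_area(3 * probable_error), 4))  # thrice the probable error
```

The last line illustrates the remark that thrice the probable error covers about the same area as twice the standard error.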

5. The problem before us is to consider how statistical
measures calculated from limited data may vary about the
expected values. Two methods of approach are available. We
may, as indicated in § 2, make a large number of experiments
(or collect a large number of samples) and calculate the
statistical measure in question for each of them. The procedure
would generally be much too laborious, and we take, therefore,
the second line of approach. Algebraic analysis based on the
theory of probability enables us to determine the standard
error that we should find in the limit if the sampling process
were repeated indefinitely so that all possible samples were
included in their expected proportions. We can often go further
and determine the actual curve to which the frequency
distribution of a particular statistical measure will tend as the
number of samples is increased.

We may now take a simple illustration and find the standard
error of the frequency, say n, with which an event will happen
in m independent trials, where p is the probability of its
happening and q of its failing. The probability of n being equal to
m, m − 1, ..., 2, 1, 0 is given by the terms of the binomial
expansion of $(p + q)^m$. Taking moments about the point
represented by n = m, the first moment is

$$m p^{m-1} q + m(m-1) p^{m-2} q^2 + \dots + m q^m = mq(p+q)^{m-1} = mq.$$

The second moment about the same point is

$$\begin{aligned}
& m p^{m-1} q + 2m(m-1) p^{m-2} q^2 + \tfrac{3}{2} m(m-1)(m-2) p^{m-3} q^3 + \dots + m^2 q^m \\
&\quad = \left\{ m p^{m-1} q + m(m-1) p^{m-2} q^2 + \tfrac{1}{2} m(m-1)(m-2) p^{m-3} q^3 + \dots + m q^m \right\} \\
&\qquad + \left\{ m(m-1) p^{m-2} q^2 + m(m-1)(m-2) p^{m-3} q^3 + \dots + m(m-1) q^m \right\} \\
&\quad = mq + m(m-1) q^2 .
\end{aligned}$$

The second moment about the mean is, therefore,

$$mq + m(m-1) q^2 - m^2 q^2 = mq(1-q) = mpq,$$

∴ the standard error = $\sqrt{mpq}$.

That is to say, if we repeatedly make m independent trials
the observed frequency of occurrence, n, will vary about the
expected value mp with a standard error of $\sqrt{mpq}$.
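
The algebra above can be confirmed for any particular case by summing over the terms of the binomial directly. The sketch below does this for illustrative values m = 12 and p = .3 (these values are assumptions chosen only for the check); the function name `binomial_moments` is hypothetical.

```python
from math import comb, sqrt


def binomial_moments(m: int, p: float):
    """Mean and second moment about the mean of the frequency n in m trials."""
    q = 1.0 - p
    # Probability of exactly n happenings: the terms of the binomial (p + q)^m.
    probs = [comb(m, n) * p**n * q ** (m - n) for n in range(m + 1)]
    mean = sum(n * pr for n, pr in enumerate(probs))
    second_moment = sum((n - mean) ** 2 * pr for n, pr in enumerate(probs))
    return mean, second_moment


m, p = 12, 0.3  # illustrative values only
mean, second_moment = binomial_moments(m, p)
standard_error = sqrt(second_moment)

print(mean, m * p)                               # expected value mp
print(standard_error, sqrt(m * p * (1 - p)))     # standard error sqrt(mpq)
```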

6. We may now apply this result to a few examples.

(a) It has been remarked that the number of male children
born is to the number of female children born as 1,050 : 1,000;
in other words, the probability of a child being male is
1,050/2,050. If 51,350 out of 100,000 children proved to be
males in a certain community, would it be safe to base on
the statistics any theory connected with the variation from
the usual probability? The expected result is 51,220, and
the standard error is

$$\sqrt{100{,}000 \times \frac{1050}{2050} \times \frac{1000}{2050}} = \pm 158.07 .$$

The difference between the actual case and the expected result
was 130, and as this is less than the standard error, no definite
conclusion can be based on the divergence from the expected
result.

(b) If the number of cases had been 10,000,000, and the
actual number 5,135,000, then the standard error being
1,580.7 and the actual difference 13,000, it would have been
sufficient evidence for the conclusion that the ratio 1,050 : 1,000
did not fit the particular case.

(c) If the probability of death within a year is .007, the
probable error in 200 cases is
$.67449\sqrt{200 \times .007 \times .993} = .80$,
and it would, therefore, be possible to approximate to a loading
for emergencies if 2.2 was taken instead of 1.4 as the number of
deaths expected in a year out of 200 cases on risk for a year.
The probable error would, I think, be preferable to the standard
error for this purpose. That is, it would not be unreasonable to
treat .011 as the rate of mortality instead of .007 in order to
obtain some idea of an emergency loading for term assurances
on the assumption that the number of cases is about 200 and
the average age is such that .007 might be taken as the
probability of death in a year. It has also been assumed that
it is correct to treat each class as if it were subject to its
own rate of mortality and had to be treated independently
of the rest of the business; that is, however, a debatable
point.

(d) It will be noticed that if m remains constant, then
$\sqrt{mpq}$ has its largest numerical value when p = q = ½, which
shows that an insurance office will generally find that if it has
two classes of equal size, and one is subject to a higher rate of
mortality than the other, the former will have the larger actual
deviations from the expected number of claims, because the
probability of dying in a year only reaches the value ½ at the
end of the mortality table.
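
The arithmetic of examples (a), (b) and (c) may be reproduced in a few lines. This is only a sketch of the calculation described above; the helper name `binomial_standard_error` is an illustrative assumption.

```python
from math import sqrt


def binomial_standard_error(m: int, p: float) -> float:
    """Standard error sqrt(mpq) of a frequency observed in m independent trials."""
    return sqrt(m * p * (1 - p))


p_male = 1050 / 2050

# Example (a): 51,350 males among 100,000 births.
expected_a = 100_000 * p_male
se_a = binomial_standard_error(100_000, p_male)
deviation_a = 51_350 - expected_a

# Example (b): 5,135,000 males among 10,000,000 births.
se_b = binomial_standard_error(10_000_000, p_male)
deviation_b = 5_135_000 - 10_000_000 * p_male

# Example (c): probable error of the deaths among 200 lives at a rate of .007.
probable_error_c = 0.67449 * binomial_standard_error(200, 0.007)

print(round(expected_a), round(se_a, 2), round(deviation_a))
print(round(se_b, 1), round(deviation_b), round(deviation_b / se_b, 1))
print(round(probable_error_c, 2), round(200 * 0.007 + probable_error_c, 1))
```

In (a) the deviation is smaller than one standard error, while in (b) the same proportionate deviation is several times the standard error, which is the contrast the examples are meant to bring out.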

7. We may now consider a frequency distribution divided
into k groups such that the proportion of cases in the sth group
is $p_s$ and, clearly, $p_1 + p_2 + \dots + p_k = 1$. If we take a
case at random from this distribution, the chance that it comes
from the sth group is $p_s$ and the chance that it comes from some
other group is $q_s = 1 - p_s$. Let us suppose that m cases are
taken at random and that $n_s$ of them fall in the sth group. Then,
though the expected value of $n_s$ is $mp_s$, this frequency may
assume values m, m − 1, ..., 2, 1, 0 with probabilities given by
the terms of the binomial $(p_s + q_s)^m$, and the standard error of
$n_s$ will be

$$\sigma_{n_s} = \sqrt{m p_s q_s} = \sqrt{m p_s (1 - p_s)} \qquad (1).$$

If, in practice, we do not know the exact form of the frequency
distribution from which the sample has been taken, we may
approximate to the standard error by putting $p_s = n_s/m$, the
observed proportion in the sth group of the sample. Hence, we
have, approximately,

$$\sigma_{n_s} = \sqrt{n_s \left(1 - \frac{n_s}{m}\right)} \qquad (2).$$

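8. As the total of all the frequencies $n_1, n_2, \dots$ is m, it
follows that, if in a particular sample $n_s$ is much greater than
$mp_s$, the other frequencies must on the average be too small,
and this shows that the errors between the groups are
correlated. The next point to be investigated is the amount of the
correlation between deviations in the frequencies of the sth and
tth groups.

The deviation, $\delta n_s$, of $n_s$ from its expected value is $n_s - mp_s$.

As we are considering the relation between deviations in $n_s$
and $n_t$ we may conveniently class together all the remaining
k − 2 frequency groups into a single remainder group, say, $n_u$.
Then

$$\delta n_s + \delta n_t + \delta n_u = 0$$

and

$$(\delta n_s + \delta n_t)^2 = (-\delta n_u)^2$$

or

$$\delta n_s\,\delta n_t = \tfrac{1}{2}\left(\delta n_u^2 - \delta n_s^2 - \delta n_t^2\right).$$

If we now imagine that a very large number, N, of random
samples is taken and the expressions on both sides of the
last equation are summed and divided by their number, N,
then

$$\frac{1}{N} S(\delta n_s\,\delta n_t) = \frac{1}{2}\left\{\frac{1}{N}S(\delta n_u^2) - \frac{1}{N}S(\delta n_s^2) - \frac{1}{N}S(\delta n_t^2)\right\}.$$

The expressions on the right-hand side represent, in the limit,
the squared standard errors of the group frequencies, given in
equation (1) above. Hence, writing $p_u = 1 - p_s - p_t$ for the
proportion in the remainder group, in the limit

$$\begin{aligned}
\frac{1}{N} S(\delta n_s\,\delta n_t) &= \tfrac{1}{2} m \left[ p_u(1-p_u) - p_s(1-p_s) - p_t(1-p_t) \right] \\
&= \tfrac{1}{2} m \left\{ (1 - p_s - p_t)(p_s + p_t) - p_s(1-p_s) - p_t(1-p_t) \right\} \\
&= -m p_s p_t \qquad (3).
\end{aligned}$$

But the correlation between $\delta n_s$ and $\delta n_t$ is

$$\frac{\text{Limit of } \frac{1}{N} S(\delta n_s\,\delta n_t)}{\sigma_{n_s}\sigma_{n_t}}
= \frac{-m p_s p_t}{m\sqrt{p_s(1-p_s)\,p_t(1-p_t)}}
= -\sqrt{\frac{p_s p_t}{(1-p_s)(1-p_t)}} \qquad (4).$$

We may again approximate to this expression by substituting
for $p_s$ and $p_t$ the proportionate frequencies, $n_s/m$ and $n_t/m$,
of the sample.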
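
Results (3) and (4) can be verified exactly for a small case by enumerating every possible sample. The sketch below assumes, purely for illustration, three groups with proportions .2, .3 and .5 and samples of m = 10; the function names are hypothetical.

```python
from math import factorial, sqrt


def multinomial_probability(counts, probs):
    """Chance of obtaining exactly these group frequencies in a random sample."""
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)
    probability = float(coefficient)
    for c, p in zip(counts, probs):
        probability *= p**c
    return probability


def exact_product_moment_and_correlation(m, probs):
    """Mean product of deviations of n_1 and n_2, and their correlation."""
    p1, p2, _ = probs
    product = var1 = var2 = 0.0
    for n1 in range(m + 1):
        for n2 in range(m + 1 - n1):
            w = multinomial_probability((n1, n2, m - n1 - n2), probs)
            d1, d2 = n1 - m * p1, n2 - m * p2
            product += w * d1 * d2
            var1 += w * d1 * d1
            var2 += w * d2 * d2
    return product, product / sqrt(var1 * var2)


m, probs = 10, (0.2, 0.3, 0.5)  # illustrative values only
product, correlation = exact_product_moment_and_correlation(m, probs)

print(product, -m * probs[0] * probs[1])                       # equation (3)
print(correlation, -sqrt(0.2 * 0.3 / ((1 - 0.2) * (1 - 0.3))))  # equation (4)
```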

9. To find the standard error of the mean of a sample of m
observations.

Let us again assume a frequency distribution divided into
k groups where $x_s$ is the value of the variable quantity x
associated with the sth group. For the reasons already
explained in the earlier Sections of this Chapter we must
distinguish between (i) the mean of the population represented by
the frequency distribution, namely

$$\bar{X} = \Sigma(p_s x_s),$$

where Σ indicates summation for all the k groups, and (ii) the
mean calculated from a particular sample of m cases drawn at
random from this population, namely

$$\bar{x} = \frac{1}{m}\Sigma(n_s x_s).$$

The standard error of $\bar{x}$, say $\sigma_{\bar{x}}$, will provide a measure of the
extent to which the mean of the sample may differ from the
mean of the population. The value of $\sigma_{\bar{x}}$ may be found by
using the results (1) and (3) of the preceding sections.
Using a similar notation, we have

$$\delta\bar{x} = \bar{x} - \bar{X} = \frac{1}{m}\Sigma(\delta n_s\, x_s).$$

As the expected value of $\delta n_s$ is zero, the expected value of $\delta\bar{x}$
is zero, or the mean value found from repeated sampling of the
mean of the sample is the same as the mean of the population.
Squaring both sides of the last equation above, we have

$$(\delta\bar{x})^2 = \frac{1}{m^2}\left\{\Sigma(\delta n_s^2\, x_s^2) + 2\Sigma'(\delta n_s\,\delta n_t\, x_s x_t)\right\},$$

where Σ' indicates summation for all pairs of values of s and t
for which s is not equal to t.

If we now assume a large number, N, of samples to have been
taken and the corresponding values of $(\delta\bar{x})^2$ summed and the
result divided by N, we obtain

$$\frac{1}{N}S(\delta\bar{x})^2 = \frac{1}{m^2}\left\{\Sigma\, x_s^2\,\frac{1}{N}S(\delta n_s^2) + 2\Sigma'\, x_s x_t\,\frac{1}{N}S(\delta n_s\,\delta n_t)\right\},$$

where S denotes the summation in respect of the N samples.
The left-hand side of this equation is the squared standard
error of the mean of the sample, or $\sigma_{\bar{x}}^2$. On the right-hand side
$\frac{1}{N}S(\delta n_s^2)$ is the $\sigma_{n_s}^2$ of equation (1) and
$\frac{1}{N}S(\delta n_s\,\delta n_t)$ is given in equation (3). Hence

$$\begin{aligned}
\sigma_{\bar{x}}^2 &= \frac{1}{m^2}\left\{\Sigma\, m p_s(1-p_s)\, x_s^2 - 2\Sigma'\, m p_s p_t\, x_s x_t\right\} \\
&= \frac{1}{m}\left\{\Sigma(p_s x_s^2) - \left(\Sigma\, p_s x_s\right)^2\right\}
= \frac{1}{m}\left(\mu_2' - \bar{X}^2\right),
\end{aligned}$$

where $\mu_2'$ is the second moment, about the origin for x, of the
distribution of the population. But $\mu_2' - \bar{X}^2 = \sigma_x^2$, therefore

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{m}} \qquad (5).$$

We thus find that the standard error of the mean is the ratio of
the standard deviation in the population to the square root
of the size of the sample.
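
The step from equations (1) and (3) to equation (5) can be checked numerically by building the squared standard error of the mean from the group results and comparing it with the population standard deviation divided by √m. The grouped distribution used below is an assumption made only for illustration, as are the function names.

```python
from math import sqrt


def se_mean_from_groups(xs, ps, m):
    """Standard error of the sample mean built up from equations (1) and (3)."""
    total = 0.0
    for s, (xs_s, p_s) in enumerate(zip(xs, ps)):
        for t, (xs_t, p_t) in enumerate(zip(xs, ps)):
            if s == t:
                moment = m * p_s * (1 - p_s)   # squared standard error, equation (1)
            else:
                moment = -m * p_s * p_t        # mean product of deviations, equation (3)
            total += xs_s * xs_t * moment
    return sqrt(total) / m


def population_sd(xs, ps):
    """Standard deviation of the grouped population."""
    mean = sum(p * x for p, x in zip(ps, xs))
    return sqrt(sum(p * (x - mean) ** 2 for p, x in zip(ps, xs)))


xs = [1.0, 2.0, 4.0, 7.0]   # hypothetical group values
ps = [0.1, 0.4, 0.3, 0.2]   # hypothetical group proportions
m = 25                      # hypothetical sample size

from_groups = se_mean_from_groups(xs, ps, m)
from_formula_5 = population_sd(xs, ps) / sqrt(m)
print(from_groups, from_formula_5)
```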

10. This last result is of considerable use in statistical work.
A large number of cases is recorded and the mean used to
compare the particular experiment with another of a like kind.
Is an actual difference between the means due to some cause
other than random sampling? A practical application would
be the comparison of the average profit from various classes of
business for a number of years. The standard error of the
profits in the various years would be obtained by taking the
square root of the second moment about the mean and dividing
it by the square root of the number of years; the quotient
would give $\sigma_{\bar{x}}$ of (5). It is only by using the standard errors
(or probable errors deduced from them) that we could say
definitely whether a lower average profit in a certain part
of the business was due to chance or to some causes requiring
removal.

11. In § 5 of this chapter it was mentioned that we can often
determine the actual curve to which the frequency distribution
of a statistical measure tends. We saw, in Chapter IV, that
$\beta_1$ and $\beta_2$ could, with the mean, be used to fix the frequency-curve
if it is of the Pearson family of curves, and it follows that if we
can find $\beta_1$ and $\beta_2$ for the frequency distribution of a statistical
measure we shall have gone a long way towards fixing the form
of the curve. If we write $\beta_1(x)$ and $\beta_2(x)$ as the moment ratios
for the population distribution of x, and $\beta_1(\bar{x})$ and $\beta_2(\bar{x})$ for the
distribution of $\bar{x}$ (the sample mean) in repeated samples of
size m, then it can be shown (see R. Henderson, J. Inst. Actu.
XLI, 429) that

$$\beta_1(\bar{x}) = \frac{\beta_1(x)}{m}, \qquad
\beta_2(\bar{x}) = 3 + \frac{\beta_2(x) - 3}{m} \qquad (6).$$

Thus if the distribution of x is represented by the normal curve
for which $\beta_1(x) = 0$ and $\beta_2(x) = 3$, it is seen that

$$\beta_1(\bar{x}) = 0 \quad\text{and}\quad \beta_2(\bar{x}) = 3$$

and the distribution of $\bar{x}$ is also normal. Even if the
distribution of x is not normal, it follows from equations (6) that $\beta_1(\bar{x})$
approximates to zero and $\beta_2(\bar{x})$ approximates to 3 if m is not
too small.
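
Equations (6) can be illustrated by forming the exact distribution of the total of m independent draws from a small skew distribution and computing its moment ratios; these are the same as the ratios for the mean, since β₁ and β₂ are unaffected by a change of scale. The population proportions below are hypothetical and chosen only for the illustration.

```python
def convolve(a, b):
    """Distribution of the sum of two independent variables on 0, 1, 2, ..."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def betas(probs):
    """beta_1 = mu_3^2 / mu_2^3 and beta_2 = mu_4 / mu_2^2 for values 0, 1, 2, ..."""
    mean = sum(i * p for i, p in enumerate(probs))

    def mu(r):
        return sum((i - mean) ** r * p for i, p in enumerate(probs))

    return mu(3) ** 2 / mu(2) ** 3, mu(4) / mu(2) ** 2


population = [0.5, 0.3, 0.15, 0.05]  # hypothetical skew distribution on 0, 1, 2, 3
m = 8                                # hypothetical sample size

total = [1.0]
for _ in range(m):
    total = convolve(total, population)  # exact distribution of the sample total

b1_x, b2_x = betas(population)
b1_mean, b2_mean = betas(total)

print(b1_mean, b1_x / m)
print(b2_mean, 3 + (b2_x - 3) / m)
```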

12. The standard error of a standard deviation may be
taken as $\sigma\sqrt{\dfrac{\beta_2 - 1}{4m}}$ for large samples, and when the
distribution of the population approximates to the normal curve
of error (when $\beta_2(x) = 3$) the standard error becomes $\sigma/\sqrt{2m}$.

Another standard error which is often useful relates to the
difference between two percentages or proportions. Thus if we
make $m_1$ trials and the event happens $n_1$ times, and in an
independent $m_2$ trials we find $n_2$ happenings, in what circumstances
can we conclude that $p_1 = p_2$, where the sample estimate of
$p_1$ is $n_1/m_1$ and of $p_2$ is $n_2/m_2$? The solution might be useful
when two rates of mortality, withdrawal or sickness are being
compared.

If $p_1 = p_2 = p$, say, the standard error of the difference
$n_1/m_1 - n_2/m_2$ is $\sqrt{p(1-p)(1/m_1 + 1/m_2)}$. We do not really
know p, the underlying proportion to which the $p_1$ and $p_2$ of
our experiments approximate, but on the hypothesis that there
is a common value we may make an estimate of it from

$$\frac{n_1 + n_2}{m_1 + m_2}.$$

This leads to a standard error of

$$\sqrt{\frac{n_1 + n_2}{m_1 + m_2}\left(1 - \frac{n_1 + n_2}{m_1 + m_2}\right)\left(\frac{1}{m_1} + \frac{1}{m_2}\right)}.$$

As an example we may take (1) 1000 cases with 22 withdrawals
giving a rate of withdrawal of .0220 and (2) 600 cases with
19 withdrawals giving a rate of .0317. Is the difference .0097
significant? The combination of the two experiences gives
41/1600 or .0256 as the rate of withdrawal. The standard error
by the formula last given is .0082. The difference, being little
more than the standard error, is not significant. If however the
numbers had all been twenty times greater, the standard error
would have been .0018 and the difference, being more than five
times as great, would be clearly significant.
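
The formula for the standard error of a difference between two proportions is easily evaluated directly. The sketch below applies it to the withdrawal figures of the example (22 in 1000 and 19 in 600) and to the same figures multiplied by twenty; the function name is an illustrative assumption.

```python
from math import sqrt


def se_difference_of_proportions(n1: int, m1: int, n2: int, m2: int) -> float:
    """Standard error of n1/m1 - n2/m2 on the hypothesis of a common proportion."""
    pooled = (n1 + n2) / (m1 + m2)  # estimate of the common value p
    return sqrt(pooled * (1 - pooled) * (1 / m1 + 1 / m2))


difference = 19 / 600 - 22 / 1000
se = se_difference_of_proportions(22, 1000, 19, 600)
se_twenty_times = se_difference_of_proportions(440, 20_000, 380, 12_000)

print(round(difference, 4))
print(round(se, 4), round(difference / se, 2))
print(round(se_twenty_times, 4), round(difference / se_twenty_times, 2))
```

Multiplying all the numbers by twenty leaves the difference unchanged and divides the standard error by √20.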

13. In similar ways it is possible to find the standard errors
of the moments and constants, but this leads to the more
theoretical parts of the subject with which it is inadvisable to
deal in a book of this character. It is, however, necessary to
call attention to the standard error of the coefficient of
correlation owing to the importance of that function in
statistical work.

As in the case of the mean, it will help to avoid confusion if
we use a symbol, ρ, for the correlation coefficient in the
population itself different from the symbol, r, for the coefficient
calculated from a particular sample of m pairs of observations.
From one sample to another r will vary about ρ and it has been
shown that the standard error of r is, for large samples,*

$$\sigma_r = \frac{1 - \rho^2}{\sqrt{m - 1}} \text{ approximately} \qquad (7).$$

If we do not know ρ we use r as an approximation to it.

* When m is large $\sqrt{m}$ can be used for $\sqrt{m-1}$ here and in similar formulae.

14. This result was first given with $\sqrt{m}$ in the denominator
by K. Pearson and L. N. G. Filon as an approximation when
m is large (Philos. Trans. A, cxci, 231-41). Later R. A. Fisher
(see Biometrika, x, 507-21) obtained the exact distribution for
r when samples are drawn from a population following the
normal correlation surface of p. 159 above. The closeness of
the approximation by formula (7) as well as the form of the
sampling distribution of r in such circumstances can be studied
from tables given in Tables for Statisticians, Part II, Table XXXII
or Biometrika, xi, 328 et seq. It can be seen from these tables
that if ρ = 0, formula (7) gives a good value for $\sigma_r$ even for very
small values of m, but as ρ becomes larger the approximation is
less satisfactory, partly because the formula does not give a
close value and partly because, even if $\sigma_r$ be found closely, the
distribution of r is such that + and − deviations are not
equally likely and the usual rule that twice the standard error
covers nearly the whole field may not apply. It is difficult to
give a more definite statement but it may be of help to say
that if m > 400 formula (7) can be used.† If however m = 100,
care is needed in interpreting $\sigma_r$ unless ρ is less than .5, and if
m = 50, unless ρ is less than .3.

R. A. Fisher suggested (Metron, i, 1921) a useful
transformation to

$$z = \tfrac{1}{2}\left\{\log_e(1 + r) - \log_e(1 - r)\right\},$$

which is distributed normally with a standard error of

$$1/\sqrt{m - 3}$$

whatever the value of ρ.

15. As an application of formula (7) we may take the
example in Chapter VII where we found that the coefficient
of correlation between the age at maturity and the unexpired
term of endowment assurances is .254. It is not right however
to assert that this coefficient exactly represents the correlation;
the real measure may be greater or less, and considerations
arise similar to those exemplified in § 6. But there is another
point in connection with a coefficient of correlation: we cannot
even say that there is any real relationship till we have
examined the standard error. In our example m = 2870 and
r = .254, so that $\sigma_r$ = ± .016. In this case, therefore, the
standard error is so small that the result is reliable, but if we
had found r = .073 with a standard error of .05 it would have
been impossible to say definitely that the correlation had not
arisen merely from chance.

16. This brings us to an important application of the
standard error in formula (7) which can be made safely even
when m is as small as 30. If there is really no correlation, then
ρ = 0 and the expression in (7) reduces to

$$\sigma_r = 1/\sqrt{m - 1} \qquad (8).$$

Thus, to go back to the example in Chapter VII, if we assume
that there is no correlation, r − ρ = .254 with a standard error
of $1/\sqrt{2869}$ or .0187. The difference, r − ρ, is well over twelve
times the standard error; it is therefore almost impossible that
the correlation was zero in the population from which the
sample of 2870 may be supposed to have been drawn.

† For m = 400, ρ = .9, we find $\sigma_r$ = .00957, while formula (7) gives .00951; the
distribution of r is described by mean r = .8998, mode = .9011, $\beta_1$ = .07402,
$\beta_2$ = 3.1342. This can only be roughly represented by a normal curve.
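
Formulae (7) and (8) and Fisher's transformation are simple enough to be set out as small functions. The following is a sketch only; the function names `se_r`, `fisher_z` and `se_z` are illustrative and not taken from the text.

```python
from math import atanh, log, sqrt


def se_r(rho: float, m: int) -> float:
    """Approximate standard error of r by formula (7); rho = 0 gives formula (8)."""
    return (1 - rho**2) / sqrt(m - 1)


def fisher_z(r: float) -> float:
    """Fisher's transformation of a correlation coefficient."""
    return 0.5 * (log(1 + r) - log(1 - r))


def se_z(m: int) -> float:
    """Standard error of the transformed coefficient, whatever the value of rho."""
    return 1 / sqrt(m - 3)


# Case of the footnote: m = 400, rho = .9.
print(round(se_r(0.9, 400), 5))

# Chapter VII example on the hypothesis of no correlation: m = 2870, r = .254.
se_null = se_r(0.0, 2870)
print(round(se_null, 4), round(0.254 / se_null, 1))

# The same test made on the transformed scale.
print(round(fisher_z(0.254), 4), round(fisher_z(0.254) / se_z(2870), 1))
```

On either scale the observed coefficient is many times its standard error, in agreement with the conclusion of § 16.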

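17. Formula (7) above is appropriate only for a coefficient
of correlation calculated by the method described in Chapter
VII. In using the fourfold table the standard errors are larger,
as would be expected, because the grouping is rougher, and the
formula by which they should strictly be calculated becomes
complicated. The formula referred to gives as the standard
error of r

$$\frac{1}{\sqrt{N}\,\chi_0}\left\{\frac{(a+d)(c+b)}{4N^2} + \psi_2^2\,\frac{(a+c)(d+b)}{N^2} + \psi_1^2\,\frac{(a+b)(d+c)}{N^2}
+ 2\psi_1\psi_2\,\frac{ad-bc}{N^2} - \psi_2\,\frac{ab-cd}{N^2} - \psi_1\,\frac{ac-bd}{N^2}\right\}^{1/2},$$

where

$$\chi_0 = \frac{1}{2\pi\sqrt{1-r^2}}\; e^{-(h^2+k^2-2rhk)/2(1-r^2)},$$

$$\psi_1 = \frac{1}{\sqrt{2\pi}}\int_0^{\frac{h-rk}{\sqrt{1-r^2}}} e^{-x^2/2}\,dx, \qquad
\psi_2 = \frac{1}{\sqrt{2\pi}}\int_0^{\frac{k-rh}{\sqrt{1-r^2}}} e^{-x^2/2}\,dx,$$

and it is assumed that the fourfold table is so arranged that
a + c > b + d and a + b > c + d, where a, b, c, and d have the
meanings indicated on p. 173. The numerical work for finding
the standard error of r for the example in Chapter IX is as
follows:

$$\frac{h-rk}{\sqrt{1-r^2}} = .56821, \qquad \psi_1 = \frac{1}{\sqrt{2\pi}}\int_0^{.56821} e^{-x^2/2}\,dx = .21505$$

by Tables for Statisticians, Part I, Table II;

$$\frac{k-rh}{\sqrt{1-r^2}} = .32230, \qquad \psi_2 = \frac{1}{\sqrt{2\pi}}\int_0^{.32230} e^{-x^2/2}\,dx = .12639;$$

$$\chi_0 = \frac{1}{2\pi\sqrt{1-r^2}}\; e^{-(h^2+k^2-2hkr)/2(1-r^2)} = \frac{e^{-.86744}}{2\pi \times .63900} = .10462;$$

$\log\dfrac{1}{\chi_0} = .98039$, $\log\psi_1 = \bar{1}.33254$ and $\log\psi_2 = \bar{1}.10171$, and
the standard error of r is

$$\frac{1}{\sqrt{N}\,\chi_0}\sqrt{\{.02283 + .00145 + .00449 + .00252 - .00408 - .01015\}} = \pm\, .018 .$$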
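
The tabular values used in this numerical work can be checked with the error function, and the terms in the bracket combined. The sketch below uses only the figures printed above; since the number of cases N of the Chapter IX example is not restated here, the result is left as a multiple of 1/√N, and the comparison with formula (7) takes √N for √(N − 1), which is an assumption of the sketch.

```python
from math import erf, exp, pi, sqrt


def psi(upper_limit: float) -> float:
    """(1/sqrt(2*pi)) times the integral of exp(-x^2/2) from 0 to upper_limit."""
    return 0.5 * erf(upper_limit / sqrt(2))


psi_1 = psi(0.56821)
psi_2 = psi(0.32230)
chi_0 = exp(-0.86744) / (2 * pi * 0.63900)

# Sum of the six terms inside the bracket, as printed in the text.
bracket = 0.02283 + 0.00145 + 0.00449 + 0.00252 - 0.00408 - 0.01015

# The standard error of r is this factor divided by sqrt(N).
factor = sqrt(bracket) / chi_0

# Formula (7) would give (1 - r^2)/sqrt(N) nearly, with sqrt(1 - r^2) = .63900.
ratio_to_formula_7 = factor / 0.63900**2

print(round(psi_1, 5), round(psi_2, 5), round(chi_0, 5))
print(round(factor, 3), round(ratio_to_formula_7, 2))
```

The ratio illustrates how much larger the fourfold-table standard error is than that of formula (7) in this example.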

18. The standard errors found by this method are larger than
would result from formula (7) and in many cases are as much
as three times as great; this actually happens in our example.
The correct formula is rather troublesome, but Tables for
Statisticians, Part I, Tables XXIII and XXIV, based on an
approximation, minimise the arithmetical work. The
approximation can be safely used except when the divisions of the
correlation table differ extremely.

19. It will be noticed that, as we anticipated, all the
expressions for the standard errors contain the square root of the
number of cases in the denominator. We anticipated in the
first paragraphs of this chapter that the standard error would
decrease as the number of cases increased and we can now say
that in each of the cases discussed the standard error varies
inversely with the square root of the number of cases. The
student should make it a rule to work out standard errors and
he will find that much labour can be saved by using tables,
usually of "probable errors", that have been published in the
Tables for Statisticians.

The object in calculating standard errors is to prevent
ourselves from reading too much from the means or other
measures we have calculated, but we must not run to the
opposite extreme and rely more on a standard error than the
theory justifies. Thus, at certain points, our theory has assumed
that the characteristics are distributed in a form approximating
to a normal curve of error, and a good deal of evidence has
been produced showing that this is a reasonable assumption
for many characteristics when the number of cases exceeds 30,
or for some characteristics with even smaller numbers. The
assumptions imply that plus and minus errors are equally
probable, but it would not be right to assert that the means of
samples from a J-shaped distribution are equally likely to fall
above and below the true mean within twice the standard
error, and formulae (6) above help to indicate this limitation.

20. We may now refer briefly to some practical points in
"sampling". The essence of sampling is that we form an
opinion of the whole by examining a sample of it, and error
may arise (1) owing to bias in making up the sample or (2)
owing to the particular sample giving a wide deviation from
the whole because it is based on a small number of cases.

It is, usually, not difficult to guard against bias in actuarial
or sociological practice. For instance, if we require to estimate
the mortality of lives assured we might collect information
merely in respect of persons whose names begin with A. This
would give fewer cases, but there is no reason to suspect that
such lives differ from those whose names begin with the other
letters of the alphabet. The selection of a particular letter might,
however, lead to suspicion if it could introduce a question of
race in a mixed community, e.g. in Alsace-Lorraine, if we
worked with people whose names begin with W we should
exclude those of French extraction but include those of
German extraction. An alternative is to take one case in, say,
each hundred, e.g. the mortality of lives assured could be
investigated by examining from the registers of the insurance
offices every hundredth case.

Sampling of this kind is useful in social investigations where
we may, perhaps, want to examine the home conditions of
school children and cannot hope to get from every home
particulars of the health, occupation or habits of the residents.
We might, however, be able to make an exhaustive
examination of 2,000 or 3,000 cases. With a free hand it is not difficult
to obtain a random sample, and a little thought and common
sense is all that is required.

The other risk of error lies in the fact that we have only a
small sample, and it is here that the subject is connected with
that of "standard errors". If we may assume that the sample
is chosen at random and, though not of itself small, is small
compared with the population from which it is drawn, then we can
follow the methods indicated in the earlier part of this chapter.

21. Special circumstances, however, arise in some
experiments, and one type of case may be specially mentioned. It is
frequently necessary to test the comparative yields of different
varieties of the same plant. The trouble in such a case is that
plots placed far apart even in a small field produce widely
different results, but small adjacent plots resemble each other.
In order to make a fair comparison we ought, therefore, to have
a number of pairs of adjacent plots. The comparison is made
between a number of pairs and we are concerned with the
differences between these pairs and must work out

$$\sigma^2 = \frac{S(y - x)^2}{m},$$

where m is the number of "pairs", and x and y are the
corresponding members of a pair measured from their means.

It is important to distinguish this sort of case, where the
pair formed from adjacent plots is the unit, from the different
case where we draw a sample of $m_1$ observations from one
record with a standard deviation of $\sigma_1$ and a second sample of
$m_2$ from another record in which the standard deviation is $\sigma_2$.
In this case the standard deviation of the difference between the
two means is given by

$$\sqrt{\frac{\sigma_1^2}{m_1} + \frac{\sigma_2^2}{m_2}}.$$

This assumes that there is no correlation between the variables,
but in the "pairs" problem we have arranged "pairs" because
we expect correlation. Algebraically the correlation is
indicated by the xy term in $S(y - x)^2 = S(y^2 - 2xy + x^2)$.

It will be appreciated from what has been written elsewhere
in this chapter that it is assumed that the samples are
sufficiently large to justify the assumption that the σ's calculated
from the samples can be treated as the standard deviations of
the population.

The use of the wrong formula may lead to erroneous
conclusions: the actual difference between the means may be
30, the standard error by $\sigma^2 = S(y - x)^2/m$ may be 6, and by
$\sigma^2 = \sigma_1^2/m_1 + \sigma_2^2/m_2$ may be 12. Judged by the former the
difference is almost certainly significant; judged by the latter it is
doubtful.

The kind of problem indicated might arise whenever it is
necessary to compare the results of alternative methods in
changing conditions, and the theory which was worked out
primarily to test yields may prove valuable elsewhere.
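
The contrast between the two formulae may be seen with a small made-up example. The yields below are hypothetical figures for six pairs of adjacent plots, and the sketch assumes that the standard error of the mean difference is obtained by dividing the variance of the paired differences by the number of pairs before taking the square root; both function names are illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Second moment about the mean."""
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values) / len(values)


def paired_standard_error(x, y):
    """Standard error of the mean difference when each pair is the unit."""
    differences = [b - a for a, b in zip(x, y)]
    return sqrt(variance(differences) / len(differences))


def unpaired_standard_error(x, y):
    """Standard error of the difference of means of two uncorrelated samples."""
    return sqrt(variance(x) / len(x) + variance(y) / len(y))


# Hypothetical yields: plots differ widely across the field,
# but the two members of each adjacent pair resemble each other.
x = [40.0, 52.0, 61.0, 45.0, 70.0, 58.0]
y = [43.0, 56.0, 63.0, 49.0, 75.0, 60.0]

difference_of_means = mean(y) - mean(x)
print(round(difference_of_means, 2))
print(round(paired_standard_error(x, y), 2))
print(round(unpaired_standard_error(x, y), 2))
```

With data of this kind the paired standard error is much the smaller, because the correlation between adjacent plots removes most of the variation across the field.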


CHAPTER XI

THE TEST OF GOODNESS OF FIT

1. When the values of ordinates and areas were calculated
in the examples of the various types of frequency-curves, no
systematic attempt was made to test the graduations in order
to ascertain whether the results obtained were reasonable.
Actuaries have generally been in the habit of imposing on the
graduated values of any table on which they may have been
working, rough checks which have amounted to a comparison
of the totals in various groups and an inspection of the changes
of sign in the differences between the graduated and
ungraduated figures. The problem of the goodness of fit needs,
however, more accurate treatment; for inspection, even when
aided by the calculation of a standard error for each group, can
only tell that certain differences are large, and if the standard
error be exceeded in two or three cases, it is impossible to say
whether the excesses are in any way balanced by equalities in
the rest of the graduation. A test is required which will give
some measure of the disagreement as judged by the whole
graduation.

2. Now, if there be N observations distributed in n
groups, the numbers in the groups being $m'_1, m'_2, \dots, m'_n$, we
have to find a criterion to enable us to decide when the series
$m_1, m_2, \dots, m_n$ will be a legitimate graduation. We may
clearly take a legitimate graduation to be one in which the
observed values (m') do not differ from the theoretical (m) by
more than the deviations that would be expected in random
sampling. What we require to know is not the probability that
the particular series of m''s will occur if the m's represent the
theory, but the probability that the m''s, or an equally likely
or less likely series, will arise. To appreciate the difficulties of
the problem we may consider the simplest case, that of a
coin-tossing experiment, and suppose that a coin has been tossed
six times and come down 4 heads and 2 tails. The "graduation"
we make is 3 heads and 3 tails, and to test it we require to find
the probability of obtaining a result as unlikely, or more
unlikely than the observed one. This probability is the same as
that of getting any one of the following results:

6 heads and 0 tails
5 heads and 1 tail
4 heads and 2 tails
2 heads and 4 tails
1 head and 5 tails
0 heads and 6 tails

It is impossible to calculate such probabilities directly, even
when the simple probabilities leading to the deviations are
known, in any but the easiest cases, but when we do not know
the simple probabilities, or the case is a complicated one, a
further difficulty is introduced owing to our inability to tell
from a priori reasoning which of the possible cases are more or
less likely than that which has actually arisen. It would, for
instance, be difficult to say, without a large amount of
arithmetical work, when 20 dice were being thrown, whether the
probability of getting ten "sixes" or more was greater than
that of getting two "sixes" or fewer, but this is an extremely
simple case compared with the general proposition in which
deviations over a series of numbers have to be considered.
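
For cases as simple as these the probabilities can be obtained by direct summation of binomial terms. The sketch below works out the coin-tossing case and the comparison for 20 dice; the function names are illustrative, and the rule used for the coin (sum every result whose probability does not exceed that of the observed one) is the reading of "as unlikely, or more unlikely" adopted for this sketch.

```python
from math import comb


def prob_as_unlikely(tosses: int, observed_heads: int) -> float:
    """Chance, with a fair coin, of a result no more probable than the observed one."""
    weights = [comb(tosses, heads) for heads in range(tosses + 1)]
    threshold = weights[observed_heads]
    return sum(w for w in weights if w <= threshold) / 2**tosses


def binomial_range(m: int, p: float, low: int, high: int) -> float:
    """Chance of between low and high happenings (inclusive) in m trials."""
    return sum(comb(m, k) * p**k * (1 - p) ** (m - k) for k in range(low, high + 1))


# Six tosses giving 4 heads and 2 tails: every result except 3 heads and 3 tails.
p_coin = prob_as_unlikely(6, 4)

# Twenty dice: ten "sixes" or more against two "sixes" or fewer.
ten_or_more = binomial_range(20, 1 / 6, 10, 20)
two_or_fewer = binomial_range(20, 1 / 6, 0, 2)

print(p_coin)
print(ten_or_more, two_or_fewer)
```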

3. If it is assumed in any measurement on one subject that
the deviations from the mean take the form of the "normal
curve of error", and it is required to estimate the chance of
obtaining deviations greater than a certain value (t, say), it will
be necessary to sum all values of the normal curve beyond t on
each side of the mean, i.e. we must take

$$\int_{-\infty}^{-t} e^{-x^2/2\sigma^2}\,dx + \int_{t}^{\infty} e^{-x^2/2\sigma^2}\,dx = 2\int_{t}^{\infty} e^{-x^2/2\sigma^2}\,dx$$

and divide the result by the area of the whole curve, i.e. by
the total deviations. Assuming that there are two
measurements instead of one (the exposed to risk, for instance, at two
ages), the deviations are, as it were, in two directions instead
of one, and it is necessary to take an expression with two
variables instead of one. The expression analogous to the
normal curve is the correlation surface

$$z = \frac{N}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\,
\exp\left\{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right\}$$

with which we have already dealt. The integrations must be
performed for both variables from t and t' onwards, and
compared with the total. If there are n measurements it becomes
necessary to deal with a function of n variables, and this will
give the reader a slight idea of the problem from the
mathematical point of view, and suggest that he will expect the
quotient of two n-fold integrals to give the probability. The
next step is to reduce these n-fold integrals to the form of
ordinary integrals, and it has been shown* that the result

$$P = \frac{\displaystyle\int_{\chi}^{\infty} e^{-\chi^2/2}\,\chi^{n-1}\,d\chi}{\displaystyle\int_{0}^{\infty} e^{-\chi^2/2}\,\chi^{n-1}\,d\chi}\;†$$

is reached. In this expression χ stands for a complex function
depending on the n variables from which the expression was
evolved, and measures the position that is indicated by the
probability of the particular distribution, the test for the
graduation of which is required.

* Originally by K. Pearson, "On the criterion that a given system of deviations
from the probable in the case of a correlated system of variables is such that
it can be reasonably supposed to have arisen from Random Sampling,"
Phil. Mag., July 1900. A short proof has been given by H. E. Soper in
"Frequency Arrays".

† A table of P for all values of n + 1 from 3 to 30, corresponding to χ²
from 1 to 30, with a few additional values and auxiliary tables for the calculation
of further values, is given in Tables for Statisticians, Part I. An abridged table
is given in Appendix IX.


4. Before a measure of the probability P can be obtained a
value for χ must be found from the statistics of the particular
graduation, and in the paper to which reference has already
been made its value is shown to be such that

$$\chi^2 = S\left\{\frac{(m' - m)^2}{m}\right\}.$$

It is natural, almost necessary, to use the square of the
difference in order that negative differences may, equally with
positive differences, increase the improbability of the system,
while a ratio is required to bring into account the size of the
group, for an error of 15 in a group of 20 would be very large,
but in a group of 1,000 would be negligible.
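
The calculation of χ² and of the corresponding P may be sketched as follows. The closed forms used for P are the standard finite series for a whole number of variables, used here in place of the printed tables; the argument `n_variables` is the n of the integral for P in § 3, and deciding its value for a given graduation is left to the user (an assumption of this sketch, which says nothing about how the tables themselves are entered). The function names are illustrative.

```python
from math import erfc, exp, pi, sqrt


def chi_square(observed, theoretical):
    """Sum over the groups of (m' - m)^2 / m."""
    return sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))


def probability_p(chi2: float, n_variables: int) -> float:
    """Chance of a value of chi-square as great as or greater than chi2."""
    half = chi2 / 2
    if n_variables % 2 == 0:
        # Even n: exp(-chi2/2) * (1 + h + h^2/2! + ...), with h = chi2/2.
        term = total = 1.0
        for i in range(1, n_variables // 2):
            term *= half / i
            total += term
        return exp(-half) * total
    # Odd n: normal tail area plus a finite series in chi2.
    term, total = 1.0, 0.0
    for i in range(1, (n_variables - 1) // 2 + 1):
        if i > 1:
            term *= chi2 / (2 * i - 1)
        total += term
    return erfc(sqrt(half)) + sqrt(2 * chi2 / pi) * exp(-half) * total


# An error of 15 contributes very differently in a group of 20 and one of 1,000.
print(chi_square([35], [20]), chi_square([1015], [1000]))

# Illustrative values of P.
print(round(probability_p(2.0, 2), 4))
print(round(probability_p(11.07, 5), 4))
```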

5. The practical aspects of the test of goodness of fit and its 
application may now be dealt with 

• ( 1 ) If the facts representing the graduated and imgraduated 

figures are only available m groups, then the value of the pro^ 
babihty by the test will, as a rule, be lower as the number of 
groups IS increased This practical point should be borne m 
mmd as it sometimes happens that graduations are tested m 
groups of, say, 5 years of age, but the graduated figures for 
individual ages are then used unreservedly, though, strictly 
speaking, they may be no better than interpolated values. 

(2) The test assumes a distribution, and would not be 
applicable if the numbers were a series of ordinates, though the 
apphcation of the test would probably give a fan idea of the 
goodness of fit if a large number of ordinates had been given m 
the series. 

(3) TheM^ails of the experience wiU be very small and never 
fit exactly We ought to take our final theoretical groups to 
cover as much of the tail area as amounts to at least a umt of 
frequency in such cases 

(4) If the number of observations be multiplied by t, say,
and the deviations are also multiplied by t, then the value of
$\chi^2$ will be multiplied by the same figure, and the test will show
that the fit is worse. This may seem strange at first, but a
little consideration will show that it is reasonable. As a large
number of cases will give smoother series than a small number,
it follows that if two results are proportionally the same in
two examples having the same theoretical distribution but
different total frequencies, the one with greater frequency is
less probable than the one with less frequency. The probability
of a result as bad as, or worse than, three heads and one tail in
coin-tossing (two heads and two tails being the theoretical
result) is .625, but the probability of a result as bad as, or worse
than, 3 × 2 = 6 heads and 1 × 2 = 2 tails is .289. It follows
that if a distribution is based on, say, 103,480 cases and the
figures are reduced to a total of 1,000 to show the distribution
of the cases, then a graduation tested as if 1,000 were the total
frequency will give the impression that the graduation is far
closer than it really is.
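The two coin-tossing probabilities quoted in (4) can be checked by direct enumeration. The following is only a sketch: the function name `prob_as_bad_or_worse` is ours, and "as bad as, or worse" is taken to mean a split at least as far from equality as the one observed.

```python
from math import comb

def prob_as_bad_or_worse(n_coins, heads):
    """Probability of a split at least as far from n_coins/2 as `heads`."""
    deviation = abs(heads - n_coins / 2)
    favourable = sum(comb(n_coins, k) for k in range(n_coins + 1)
                     if abs(k - n_coins / 2) >= deviation)
    return favourable / 2 ** n_coins

p_small = prob_as_bad_or_worse(4, 3)   # three heads and one tail
p_large = prob_as_bad_or_worse(8, 6)   # six heads and two tails
print(round(p_small, 3), round(p_large, 3))
```

The same proportional result is thus less probable with eight coins than with four, which is the point of the paragraph.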

(5) I have found, in applying the test, that when the numbers
dealt with are very large the probability is often small,
even though the curve appears to fit the statistics very closely.
The explanation may be that the statistics with which we deal
in practice nearly always contain a certain amount of extraneous
matter, and the heterogeneity is concealed in a small
experience by the roughness of the data. The increase in the
number of cases observed removes the roughness, but the
heterogeneity remains. The meaning, from the curve-fitting
point of view, is that the experience is really made up of more
than one frequency-curve, but a certain curve, approximating
to the one calculated, predominates. Another possible
explanation is that our solution of the problem depends on the
assumption of a mathematical expression which does not give
exactly the distribution of deviations and when we deal with
a large experience the approximate nature of the assumptions
is revealed.

(6) What is the actual value of P at which a good fit ends
and a bad one begins? It is impossible to fix such a value. We
have merely a measure of probability for the whole table, and
if the odds against the graduation are twenty or thirty to one
the result is unsatisfactory; if they are ten to one the graduation
is not unreasonable, but the exact value when a result
must be discarded cannot be given. As, however, it is clearly
impossible to imagine any test which can fix an absolutely
definite standard, there is no reason for objecting to the
particular method because it fails to do so.

(7) It is sometimes thought that the introduction of
additional constants must necessarily improve the fit of a
curve. It may do so in some cases, but it is quite possible to
take a curve with ten constants and find it gives a worse result
than another having only three. Besides this, there is the
possibility of undergraduation; we must not expect to reach
a very high value for P, e.g. .95. If we make an experiment in
coin-tossing, it is unlikely that a single experiment will give
a distribution very close to the theoretical. If therefore we are
estimating the probability of getting that result or worse, we
shall only rarely get a very high or a very small value for that
probability. We shall do so occasionally, but we must not
expect it, and it is wise to look for explanations when any
graduation gives a very high or very low value of P.

(8) It may sometimes be advisable to use a curve giving a 
worse agreement than another for simplicity, or for reasons 
such as those which prompt actuaries to employ Makeham’s 
hypothesis. 

6. In a paper "On the Comparative Reserves of Life
Assurance Companies, etc." (J. Inst. Actu. xxxvii, 458-9),
George King remarked that it is permissible to use the
Model Office for the $O^M$, and it will be interesting to apply the
formulae given above to see what is the probability of the $O^M$
distribution if the Model Office distribution be taken as the
theoretical distribution.

In the table below (p. 206) there are ten groups, and $\chi^2 = 1.79$,
and Tables for Statisticians give P = .999438 and .991468 when
$\chi^2 = 1$ and 2 respectively. It is not, however, sufficient to test
for 100 new policies; 950 would reduce the probability to
about .05, which means that in only one case out of twenty
would a random sampling lead to a system of deviations from
the Model Office as great as that shown by the $O^M$. This result will
remind the student of the great danger of dealing with
percentages without considering the actual number of cases
investigated. King's other table, which is of greater importance
in his work (policies according to attained age), shows a much
closer agreement, as P = .831051 for 10,000 cases.


              POLICIES ISSUED ARRANGED IN AGE-GROUPS

Central age     Model                  Difference       (Square of
 in group       Office      O^M        +        -       difference)
                                                       / Model Office

    20           6.97       7.30      .33                   .02
    25          17.75      20.45     2.70                   .41
    30          21.04      23.11     2.07                   .20
    35          18.41      18.40               .01          .00
    40          13.82      13.05               .77          .04
    45           9.45       8.44              1.01          .11
    50           6.23       5.07              1.16          .22
    55           3.51       2.58               .93          .25
    60           1.97       1.20               .77          .30
    65            .85        .40               .45          .24

  Total        100.00     100.00     5.10     5.10     $\chi^2$ = 1.79
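The arithmetic of § 6 can be retraced as follows. This is a sketch under stated assumptions: the first column of percentages is treated as the theoretical (Model Office) distribution, nine degrees of freedom are used (ten groups less one), and `chi2_sf` is our own helper built from the standard closed-form series for the $\chi^2$ probability rather than a lookup in Tables for Statisticians.

```python
import math

def chi2_sf(x, df):
    """P(chi-square with integer df >= 1 exceeds x), by the closed-form series."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term = total = math.exp(-half)
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) exp(-x/2) (1 + x/3 + x^2/15 + ...)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

# Percentages of policies in each age-group, copied from the table.
model_office = [6.97, 17.75, 21.04, 18.41, 13.82, 9.45, 6.23, 3.51, 1.97, 0.85]
o_table = [7.30, 20.45, 23.11, 18.40, 13.05, 8.44, 5.07, 2.58, 1.20, 0.40]

chi2_100 = sum((o - m) ** 2 / m for m, o in zip(model_office, o_table))
df = len(model_office) - 1                      # ten groups, one deducted

p_100 = chi2_sf(chi2_100, df)                   # as if only 100 policies
p_950 = chi2_sf(chi2_100 * 950 / 100, df)       # chi-square scales with the total
print(round(chi2_100, 2), round(p_100, 4), round(p_950, 3))
```

Working from unrounded terms gives a $\chi^2$ a shade below the 1.79 obtained by adding the rounded column, but the conclusion is the same: the probability falls to about one in twenty once some 950 policies are assumed.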


7. We will now revert to § 2 of this chapter where in stating
the problem it was said that the N observations were distributed
in n + 1 groups. As we have only N observations to
distribute we can only choose n groups, for having fixed those
n groups the last one is necessarily fixed; freedom of choice is
restricted to this extent, and in any problem where the method
is used the number of groups where freedom of choice is possible
must be borne in mind. This is implied in the proofs leading up
to the formulae which have been given. Now, following on this
argument the reader may ask whether it is fair in comparing
a Type I and a Type III graduation of certain material to use
the same value of n when there are four constants necessary
to reach the former and three to reach the latter. He may ask:
are we not really restricting our freedom of choice more in the
former case than the latter because, to take an extreme case,
we should reproduce a distribution of only four groups exactly
with Type I and alter it, that is have freedom of choice, if we
use Type III?





8. Before we deal with this question we may explain that in
the previous paragraphs of this chapter two distinct problems
have been covered by the one word "graduation". These
problems are:

I. Given a theoretical distribution, to ascertain the probability
of getting an actual distribution or an equally likely or a
less likely one.

Here is an example which compares the theoretical number
of "heads", when six coins are tossed, with an actual distribution.


 No. of                                          Square of
"heads"    Theoretical    Actual    (2) - (3)       (4)       (5)/(2)
  (1)          (2)          (3)        (4)          (5)         (6)

   0            1            0          1            1         1.00
   1            6            6          0            0          .00
   2           15           12          3            9          .60
   3           20           23         -3            9          .45
   4           15           18         -3            9          .60
   5            6            3          3            9         1.50
   6            1            2         -1            1         1.00

 Total         64           64                            $\chi^2$ = 5.15

If $n' = 7$ and $\chi^2 = 5.15$, then P = .52.
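The six-coin example can be reworked in a few lines. The sketch below assumes six degrees of freedom (seven groups less one) and uses the closed-form expression for the $\chi^2$ probability that holds when the number of degrees of freedom is even; the helper name `chi2_sf_even` is ours.

```python
import math

def chi2_sf_even(x, df):
    """P(chi-square > x) for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    half = x / 2.0
    term = total = math.exp(-half)
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return total

theoretical = [1, 6, 15, 20, 15, 6, 1]   # 64 tossings of six coins
actual = [0, 6, 12, 23, 18, 3, 2]

chi2 = sum((t - a) ** 2 / t for t, a in zip(theoretical, actual))
p = chi2_sf_even(chi2, len(theoretical) - 1)
print(round(chi2, 2), round(p, 2))
```

The result agrees with the figures entered against the table.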

II. Given a graduation of an actual distribution, to ascertain
the probability that the deviations will be the same as or
greater than those found.

The answer depends on the number of constants in the
formula used for graduation. If there are r constants we should
deduct r from the number of groups instead of deducting unity
as is, in effect, done in the last example for $n' - 1$. The mean
is used to fix the position of the curve and must be counted
as a constant. Consequently we must deduct 3 if the normal
curve is used (i.e. one for the total number of cases, one for the
mean and one for the s.d.), 4 for Type III and 5 for the main
types.* Generally speaking, the same result is obtained if the
number of moments used in the calculations be deducted. This
gives the theoretical answer to the question raised at the end
of § 7 above. It is not always easy to interpret the number
of moments in applying the rule: thus we choose between
Type III and Type V by using the fourth moment, though
there are only three moments needed to find the constants.
Again if in Type I the start of the curve is fixed, three moments
only are used, while if the range is fixed, only two moments
are used (and in effect the number of constants is similarly
decreased). If we make a rough attempt at a graduation by a
Type I curve using four unadjusted moments and then vary
the start of the curve as indicated in Chapter V, § 10, then the
final graduation only uses three moments. It can be argued
that the full number of constants has been assumed and four
moments have really been used.

* In Tables for Statisticians, Part I, the table is entered with $n'$,
the number of groups, and one is, therefore, already deducted. It
follows that $n' - 2$ would be the number to be used if the normal
curve has been used for graduating.
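To see how much the deduction for constants matters, the same $\chi^2$ may be referred to different numbers of degrees of freedom. The figures below (12 groups, $\chi^2 = 9$) are hypothetical and chosen only for illustration, and `chi2_sf` is our own helper for the $\chi^2$ probability, not a table from the text.

```python
import math

def chi2_sf(x, df):
    """P(chi-square with integer df >= 1 exceeds x), by the closed-form series."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = total = math.exp(-half)
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

groups = 12     # hypothetical number of groups
chi2 = 9.0      # hypothetical value of chi-square

# Number deducted from the groups in each case, following the rule of § 8.
deductions = {"Problem I": 1, "normal curve": 3, "Type III": 4, "main types": 5}
probabilities = {name: chi2_sf(chi2, groups - d) for name, d in deductions.items()}

for name, p in probabilities.items():
    print(f"{name:12s} degrees of freedom = {groups - deductions[name]:2d}  P = {p:.3f}")
```

For a fixed $\chi^2$ the probability falls as more constants are deducted, so a graduation judged by the solution of Problem I will look better than it should.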

9. The example given for Problem I in § 8 can be used to
explain the point mentioned in § 5 about undergraduation. We
may, on a particular occasion, reach an actual distribution
identical with the theoretical; $\chi^2$ will then be zero and P will
be unity. Similarly we may reach a distribution so far from the
theoretical as to seem well-nigh impossible. One of these
exceptional cases may appear and if we repeat the experiment
long enough we shall get distributions giving all values of P.
Similarly with graduation, we are unlikely, if we know the
right form of curve, to find a value of P that is infinitesimally
small or very near to unity, but neither is impossible.

10. When we merely want to compare several graduations
of the same distribution we can often stop our work after the
calculation of $\chi^2$. Thus if we make graduations by Type I using
various adjustments or compare them with Type A or Type B
using the same number of constants, the lowest value of $\chi^2$
shows the closest graduation. Even if the number of constants
differs, the value of $\chi^2$ shows which graduation is actually
closest and for some actuarial work this may be more important
than the study of the probabilities.

Bearing in mind that there are difficulties in interpreting
the number of degrees of freedom in some cases, we may
consider what is implied when we use the solution of Problem I
for Problem II. All the old applications of the (P, $\chi^2$) test were
made in this way. In such circumstances we are saying, in
effect, that the graduation is a theoretical distribution not
necessarily obtained from the actual distribution but by
general reasoning or from other previous experience, and that
we are measuring the probability of divergences from that
theoretical distribution as great as or greater than those of
the actual distribution.

The points set out in these paragraphs are mentioned
because it is well to be reminded that we must not read into a
good general test of graduation a refinement which is neither
justified by the underlying theory nor required in practical
work.

11. Reference may here be made to a test of a graduation of
a mortality table. The data are expressed as "exposed to risk"
($E_x$) at each age (or group of ages) and "deaths" ($\theta_x$). A
graduation of the rates of mortality is made and the "expected
deaths" ($\theta'_x$) are calculated by multiplying the values of $E_x$ by
the appropriate graduated rates of mortality ($q_x$). We have,
therefore, graduated the series

$$\theta_x,\; E_x - \theta_x,\; \theta_{x+1},\; E_{x+1} - \theta_{x+1},\; \ldots$$

by

$$\theta'_x,\; E_x - \theta'_x,\; \theta'_{x+1},\; E_{x+1} - \theta'_{x+1},\; \ldots$$

The $E_x$ is fixed in each pair, so, if there are 40 ages, there are
only 40 degrees of freedom, not 80. But the $\chi^2$ should be
calculated from all the 80 values, although when $E_x$ is large
relatively to $\theta_x$, as it is at nearly every age, the $E - \theta$ terms give
practically zero elements. It will be easier for the reader to follow this
argument if he bears in mind that the total of the $\theta$'s need not
be reproduced exactly by the $\theta'$'s. Deduction will have to
be made from the 40 degrees of freedom for the number of
constants used in the graduation.
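A short sketch may make the two kinds of term in § 11 concrete. The exposed to risk, deaths and graduated rates below are hypothetical figures invented for the illustration; only the form of the calculation is taken from the text. Since $(E - \theta) - (E - \theta') = -(\theta - \theta')$, both terms at an age share the same squared difference and differ only in the divisor.

```python
# Hypothetical data for four ages: exposed to risk, deaths, graduated q.
exposed = [1000.0, 1200.0, 900.0, 800.0]
deaths = [12.0, 15.0, 14.0, 16.0]
graduated_q = [0.011, 0.013, 0.015, 0.019]

chi2_deaths = 0.0      # contribution of the theta terms
chi2_survivors = 0.0   # contribution of the E - theta terms

for E, theta, q in zip(exposed, deaths, graduated_q):
    expected = E * q                              # expected deaths
    squared_difference = (theta - expected) ** 2
    chi2_deaths += squared_difference / expected
    chi2_survivors += squared_difference / (E - expected)

chi2 = chi2_deaths + chi2_survivors
print(round(chi2_deaths, 4), round(chi2_survivors, 4), round(chi2, 4))
```

With rates of mortality as small as these the $E - \theta$ terms add only a percent or two to $\chi^2$, which is why they may in practice be treated as zero.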






CHAPTER XII 


THE CORRELATION RATIO— CONTINGENCY 


1. We have seen that we can reasonably use the coefficient
of correlation when regression is linear, that is when the
means of the columns (and the means of the rows) are
approximately in a straight line, but in other circumstances
its use is open to objection. In the present chapter other
methods are described which are not open to the same
objection. We shall deal first with a function known as the
"correlation ratio" ($\eta$), which is a useful measure in some
cases.

The value of $\eta_{yx}$ is given by

$$\eta_{yx}^2 = \frac{S\{n_x(\bar{y}_x - \bar{y})^2\}}{N\sigma_y^2} \qquad (1)$$

where $\bar{y}_x$ is the mean of the y's corresponding to the particular
array x, $n_x$ is the number of cases in the array x, N the total
frequency, $\sigma_y$ the standard deviation of the y's and $\bar{y}$ is the
mean of all the y's. The summation extends over all the arrays.
In a similar way we can work from the y-arrays and have

$$\eta_{xy}^2 = \frac{S\{n_y(\bar{x}_y - \bar{x})^2\}}{N\sigma_x^2} \qquad (2)$$

These values of $\eta$ will not be the same except in the limiting
case when regression in both directions is linear and then
$\eta_{yx} = \eta_{xy} = r$. It will be seen that the correlation ratio $\eta_{yx}$ can
alternatively be expressed as the ratio of the standard deviation
of the means of the y-arrays, each array being weighted
with the number in it, to the standard deviation of the y's.
Taking the example on p. 142 we should find $\eta$ as follows:





Mean unexpired     Deduct mean
term in each       of whole
column             (20.312)
$\bar{y}_x$        $\bar{y}_x - \bar{y}$    $(\bar{y}_x - \bar{y})^2$     $n_x$     $n_x(\bar{y}_x - \bar{y})^2$

  10.333             -9.979          99.6             6            598
  13.250             -7.062          49.9             4            200
  13.176             -7.136          50.9            17            865
  16.113             -4.199          17.6            62          1,091
  17.230             -3.082           9.50          584          5,548
  20.141              -.171            .029         643             19
  21.877             +1.565           2.45        1,098          2,690
  21.665             +1.353           1.83          388            710
  21.500             +1.188           1.41           60             85
  27.625             +7.313          53.5             8            428

  Total                                            2,870         12,234

$$\eta_{yx}^2 = \frac{12{,}234}{2870 \times (7.6067)^2} = .07367$$

or $\eta_{yx}$ = .2708.

The figure 7.6067 is the value of $\sigma_y$ on p. 149, multiplied by
5, the unit of grouping.
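The computation of $\eta_{yx}^2$ by formula (1) can be retraced from the array means and array frequencies in the table. The sketch below simply copies those printed figures, together with the quoted mean 20.312 and standard deviation 7.6067; the variable names are ours.

```python
# Array means (mean unexpired term in each column) and array frequencies.
array_means = [10.333, 13.250, 13.176, 16.113, 17.230,
               20.141, 21.877, 21.665, 21.500, 27.625]
array_sizes = [6, 4, 17, 62, 584, 643, 1098, 388, 60, 8]

overall_mean = 20.312   # mean of the whole, as quoted
sigma_y = 7.6067        # standard deviation of the y's, as quoted

N = sum(array_sizes)
between = sum(n * (m - overall_mean) ** 2
              for n, m in zip(array_sizes, array_means))
eta_squared = between / (N * sigma_y ** 2)

print(N, round(between), round(eta_squared, 4))
```

The weighted sum of squares comes out at about 12,234 and $\eta_{yx}^2$ at about .0737, in agreement with the table.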

Working similarly with the maturity ages, we obtain the
following:

$\eta_{xy}^2$ = .06610 or $\eta_{xy}$ = .2571

The arithmetical processes described in §§ 11-13 of Ch. VII
supply us with most of the figures required.






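2. We may now go back to formula (1) and rearrange the
denominator. Remembering that the square of the standard
deviation can be found by squaring the difference between each
observation and the mean (see Chapter III, § 14), and writing
${}_x y$ for an observation of y in the array x, we have

$$N\sigma_y^2 = \underset{x}{S}\,\underset{y}{S}\,({}_x y - \bar{y}_x + \bar{y}_x - \bar{y})^2$$

$$= \underset{x}{S}\,\underset{y}{S}\,({}_x y - \bar{y}_x)^2 + \underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\} + 2\,\underset{x}{S}\{(\bar{y}_x - \bar{y})\,\underset{y}{S}({}_x y - \bar{y}_x)\}$$

$$= \underset{x}{S}\,\underset{y}{S}\,({}_x y - \bar{y}_x)^2 + \underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\}$$

as the final expression in the previous line vanishes.
Consequently

$$\eta_{yx}^2 = \frac{\underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\}}{\underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\} + \underset{x}{S}\,\underset{y}{S}\,({}_x y - \bar{y}_x)^2} \qquad (3)$$

$S\{n_x(\bar{y}_x - \bar{y})^2\}$ measures the amount of variation between
arrays, while $SS({}_x y - \bar{y}_x)^2$ measures the amount of variation
within the arrays. Neither part of the denominator can be
negative, therefore

$$1 \geq \eta_{yx} \geq 0$$

It also follows from (3) that for $\eta_{yx}$ to be large $S\{n_x(\bar{y}_x - \bar{y})^2\}$
must be large as compared with $SS({}_x y - \bar{y}_x)^2$; in other
words, the larger $\eta$ becomes, the greater the variation in the
means of the arrays compared to the variations within the
arrays. Also the smaller $\eta$ becomes, the less important are
the differences in the means of the arrays.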
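The splitting of the total sum of squares into a part between arrays and a part within arrays, on which formula (3) rests, can be checked numerically. The three small arrays below are hypothetical and serve only to illustrate the identity.

```python
# Hypothetical arrays of y, keyed by the value of x.
arrays = {
    1: [2.0, 3.0, 4.0],
    2: [4.0, 6.0],
    3: [5.0, 7.0, 9.0, 7.0],
}

all_y = [y for ys in arrays.values() for y in ys]
N = len(all_y)
grand_mean = sum(all_y) / N

total = sum((y - grand_mean) ** 2 for y in all_y)        # N * sigma_y^2

between = 0.0   # S{ n_x (mean of array - grand mean)^2 }
within = 0.0    # S S (observation - mean of its own array)^2
for ys in arrays.values():
    array_mean = sum(ys) / len(ys)
    between += len(ys) * (array_mean - grand_mean) ** 2
    within += sum((y - array_mean) ** 2 for y in ys)

eta_squared = between / (between + within)
print(round(total, 6), round(between + within, 6), round(eta_squared, 4))
```

Because neither part can be negative, the ratio necessarily lies between 0 and 1.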

3. The correlation ratio may be used for three main
purposes:

(a) to measure the relationship between x and y; this has
already been shown in the example,

(b) to test whether there is any real difference in the
array means, other than what might be expected from
sampling,

(c) to test whether it is reasonable to regard the regression
line as a straight line.





In dealing with (b) and (c) we must suppose that the
distribution of y for each x array is not far from a "normal"
distribution and that the standard deviations of y arrays for
given x are approximately equal. Under these conditions it
may be shown that, for the test mentioned in (b) above, if
the array means, say k in number, in the population are all
equal, so that the population value of $\eta$ is zero, then in a
sample of N pairs of values of x and y,

(i) the expected value of $\eta^2$, say $\bar{\eta}^2$, is

$$\bar{\eta}^2 = (k-1)/(N-1) \qquad (4)$$

(ii) the standard error is

$$\sigma_{\eta^2} = \frac{1}{N-1}\sqrt{\frac{2(k-1)(N-k)}{N+1}} \qquad (5)$$

Unless, therefore, the observed $\eta^2$ is larger than, say,
$\bar{\eta}^2 + 2\sigma_{\eta^2}$,
we cannot feel confident that it is significant, or that the means
of the arrays in the population differ. The distribution of $\eta^2$
is however very skew if the number of arrays is small, so that a
deviation of twice the standard error has to be viewed as
indicated in Chapter X, § 19.

Under the same conditions we can show that, for (c) above,
if in the population the means of the arrays ($\bar{y}_x$) lie on a
straight line, i.e. regression is linear, and $\eta^2 - r^2 = 0$, then in
a sample of N pairs of values of x and y the ratio

$$\frac{\eta^2 - r^2}{1 - r^2}$$

will have

(i) an expected value of

$$(k-2)/(N-2) \qquad (6)$$

(ii) a standard error of

$$\frac{1}{N-2}\sqrt{\frac{2(k-2)(N-k)}{N}} \qquad (7)$$

and we can then judge of the departure from linearity of
regression in the sample by applying a similar test to that in (b).
4. In the same numerical example (see the first table in § 1),
k = 10, N = 2870 and the values from formulae (4) and (5)
are

$\bar{\eta}^2$ = .0031, $\sigma_{\eta^2}$ = .0015.

We have actually $\eta^2$ = .0737 so that there is a real difference
in the array means.

If we take the second table and use the test of formulae (6)
and (7), we find $\eta^2$ = .06610 so near to $r^2$ = .06474 that the
ratio $(\eta^2 - r^2)/(1 - r^2)$ is .0014. The expected value is .0028
and the standard error .0014. This table shows linearity. The
first table would hardly have done so.
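The figures of § 4 can be reproduced as follows. This is a sketch which takes formula (5) as $\sqrt{2(k-1)(N-k)/(N+1)}\,/(N-1)$ and formula (7) as $\sqrt{2(k-2)(N-k)/N}\,/(N-2)$; these forms are our reading of the formulae, adopted because they give the values quoted in the text, and the function names are ours.

```python
import math

def equal_means_test(k, N):
    """Expected value and standard error of eta^2 when all array means are equal."""
    expected = (k - 1) / (N - 1)
    std_error = math.sqrt(2 * (k - 1) * (N - k) / (N + 1)) / (N - 1)
    return expected, std_error

def linearity_test(k, N):
    """Expected value and standard error of (eta^2 - r^2)/(1 - r^2) under linearity."""
    expected = (k - 2) / (N - 2)
    std_error = math.sqrt(2 * (k - 2) * (N - k) / N) / (N - 2)
    return expected, std_error

k, N = 10, 2870
mean_4, se_5 = equal_means_test(k, N)
mean_6, se_7 = linearity_test(k, N)

# Second table of § 1: eta^2 = .06610 and r^2 = .06474.
ratio_second = (0.06610 - 0.06474) / (1 - 0.06474)
# First table: eta^2 = .0737 with the same r^2.
ratio_first = (0.0737 - 0.06474) / (1 - 0.06474)

print(round(mean_4, 4), round(se_5, 4), round(mean_6, 4), round(se_7, 4))
print(round(ratio_second, 4), round(ratio_first, 4))
```

The ratio for the second table falls below its expected value, while that for the first table exceeds the expected value by several times the standard error, which bears out the remark that the first table would hardly have shown linearity.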

5. We may now turn to the theory of contingency which
gives us another way of approaching correlation and can be
used when the regression is not linear or when the facts are
given in a non-quantitative form with a greater number of
divisions than those of the fourfold tables discussed in
Chapter IX. The principle underlying the theory of contingency
is that a comparison is made between the given table and
a corresponding table having the same marginal totals but with
no correlation. The first step, therefore, is to see how to make a
table without correlation, and a little consideration will show
that all we have to do is to split up the total of any column
in proportion to the distribution of entries in the final total
column. Thus, the first column would be

Unexpired term . . . . . . . . . .        2                7         . . .
Frequency with no correlation       6 × 56/2870     6 × 172/2870     . . .

and the remaining part of the table would be formed in a
similar way. Now as each column is formed in proportion to
the total, the mean of each column must be the same as the
mean of the total, which shows at once from the definition that
no correlation can exist in such a table.

6. The following table shows, for each unexpired term, the
figures exhibiting no correlation on the upper line and those
actually occurring in brackets on the lower line.

Central unexpired term of Endowment Assurances (first column)
against central ages at maturity

Term          30      35      40      45      50      55      60      65      70      75    Total

2             .1      .1      .3     1.2    11.4    12.5    21.4     7.6     1.2      .2       56
             (2)                     (2)    (26)     (6)    (14)     (6)
7             .4      .2     1.0     3.7    35.0    38.6    65.8    23.2     3.6      .5      172
             (1)     (1)     (2)     (6)    (62)    (36)    (40)    (22)     (2)
12            .9      .6     2.6     9.3    87.8    96.8   165.4    58.4     9.0     1.2      432
                     (2)     (9)    (17)   (117)    (99)   (127)    (52)     (8)     (1)
17           1.4      .9     3.9    14.4   135.3   149.0   254.4    89.9    13.9     1.9      665
             (3)             (6)    (24)   (145)   (155)   (237)    (84)    (11)
22           1.4      .9     4.0    14.6   137.2   151.0   257.8    91.1    14.1     1.9      674
                     (1)             (3)   (133)   (167)   (271)    (78)    (20)     (1)
27           1.1      .8     3.2    11.6   109.5   120.6   205.9    72.7    11.2     1.4      538
                                     (9)    (90)   (123)   (231)    (71)    (11)     (3)
32            .5      .4     1.5     5.3    50.3    55.3    94.4    33.4     5.2      .7      247
                                     (1)    (11)    (49)   (127)    (49)     (8)     (2)
37            .2      .1      .5     1.7    15.7    17.2    29.4    10.4     1.6      .2       77
                                                     (6)    (49)    (22)
42            .0      .0      .0      .2     1.6     1.8     3.1     1.1      .2      .0        8
                                                     (2)     (2)     (3)             (1)
47            .0      .0      .0      .0      .2      .2      .4      .1      .0      .0        1
                                                                     (1)

Total          6       4      17      62     584     643   1,098     388      60       8    2,870

Now, if these two sets of figures coincide exactly
in any particular case, there is clearly no correlation in the
table; if they differ slightly there is a slight amount, and if they
differ greatly there is a considerable amount of correlation, and
we come therefore to the conclusion that an alternative method
of finding the correlation between two things is by measuring
the difference between the figures in the actual correlation
table and those that would have arisen if there had not been any
correlation. In Chapter XI we discussed a method of measuring
the goodness of fit (or amount of agreement) between two sets
of figures, and this suggests that we might calculate $\chi^2$ by
squaring the difference between each pair of figures in the table
and dividing the result by the frequency when there is no
correlation. The reason for choosing the figure from the table
with no correlation as the divisor is that it always has a value,
while the correlation table may give a frequency of zero.
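7. As it is clear that $\chi^2$ will give a measure of the association,
it will be interesting to see the connection between it and the
coefficient of correlation r, and the following proof shows that
if the correlation table can be approximately represented by
the normal correlation surface, then where the number of
groupings is large

$$r = \sqrt{\frac{\phi^2}{1 + \phi^2}} \qquad (8)$$

where $\phi^2 = \chi^2/N$.

Using the same notation as that of Chapter VIII, the
frequency with no correlation is given by

$$z' = \frac{N}{2\pi\sigma_1\sigma_2}\, e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2}\right)}$$

while that with correlation is

$$z = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2}\, e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)}$$

Then

$$\phi^2 = \frac{1}{N}\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{(z - z')^2}{z'}\,dx\,dy = \frac{1}{N}\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{z^2}{z'}\,dx\,dy - 1$$

so that, where $x' = x/\sigma_1$ and $y' = y/\sigma_2$,

$$1 + \phi^2 = \frac{1}{2\pi(1-r^2)}\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left\{\frac{1+r^2}{1-r^2}(x'^2 + y'^2) - \frac{4r}{1-r^2}x'y'\right\}}\,dx'\,dy'$$

$$= \frac{1}{1-r^2}\cdot\frac{1}{\sqrt{\left(\frac{1+r^2}{1-r^2}\right)^2 - \frac{4r^2}{(1-r^2)^2}}} = \frac{1}{1-r^2}$$

by (vi) of Appendix IV. Hence

$$1 - r^2 = \frac{1}{\phi^2 + 1}$$

or

$$r = \pm\sqrt{\frac{\phi^2}{1 + \phi^2}}$$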





8. The result just obtained may be considered a little more
closely.

(1) It shows that r must lie between -1 and +1.

(2) As the value of $\phi^2$ will not be affected by the order of the
columns (or rows), it is permissible to interchange them,
provided, of course, the whole column (or row) be moved
at once.

(3) The proof shows that r will not necessarily be obtained
exactly if a very small number of groups is used,
because by using the integral calculus an infinite number
of groups was assumed.

(4) We also assumed, however, that we were dealing with
smooth series, but as $\chi^2$ is a measure of the goodness of
fit between the correlation and no-correlation figures, a
large number of groups gives undue prominence to the
chance deviations due to the use of a random sample,
and the value of r found from that of $\phi^2$ may differ
considerably from the value reached by the xy-moment.
Too fine a grouping may give a less accurate result than
a less fine one.

9. These conclusions are borne out by practical work, and
any student who cares to go into the subject can find the value
of r by the two methods from a large table, using various
groupings, and he will see that the best agreements are
obtained when the grouping is neither very fine nor very
rough. But this general remark indicates a difficulty, for the
student will naturally wonder how he is to group his figures in
order to reduce them to a suitable number of classes. If he is
dealing with facts distributed according to age, he can take
groups of ten years instead of the finer grouping of five or
three years or he may lump together the small groups at the
ends. He will find that equal frequencies give better results
than equal ranges when the material is divided into six (or less)
classes, but when there are more than six classes equal ranges
should be taken. This rule can only be applied broadly; we
cannot from the nature of the data make exactly equal groups
of our frequencies but must be content with something
approaching equality. In order to indicate how we may
proceed and how the numerical work is done, the following
table has been prepared from that of § 6 (p. 215).


Central                        Central ages at maturity
unexpired
term           50 and under       55            60        65 and over     Total

2, 7, 12       154.6 (247)    147.9 (141)   252.6 (181)   104.9 (91)        660
17             155.9 (178)    149.0 (155)   254.4 (237)   105.7 (95)        665
22             158.1 (137)    151.0 (167)   257.8 (271)   107.1 (99)        674
27             126.2 (99)     120.6 (123)   205.9 (231)    85.3 (85)        538
32 and over     78.2 (12)      74.5 (57)    127.3 (178)    53.0 (86)        333

Total              673            643          1,098          456          2,870


10. The totals are not all equal to one another; the 1,098
cases maturing at age 60 prevent this, but they are far more
nearly equal than the totals in the original table. We now work
out $\chi^2$ and find that its value is 198.8.* Hence

$$\phi^2 = \frac{198.8}{2870} = .0693$$

and the coefficient of contingency is .254. This differs from the
figures given for $\eta$ in § 1,† and both may differ from the r found
by the method of Chapter VII; the original table does not
follow sufficiently closely the mathematical form assumed.
There is, however, a general difficulty apart from any peculiarity
of an individual case, for we can never reach a coefficient
of unity because, with a finite number of groups, $\chi^2$ can never
become infinite, which is necessary if r is to be unity. Similarly
there is a tendency to mis-state the value of r by the method
of contingency when r has other values and this depends to
some extent on the grouping of the material. Adjustments
which are of a fairly simple nature should be made.

* To make this more easy to follow we may mention that the contributions
to $\chi^2$ from the first column are 55.1, 3.1, 2.8, 5.8 and 56.0.

† It happens to agree with r from Chapter VII. In the particular case the
errors from broad grouping and from deviations from the assumed form happen
to balance. The agreement is an illustration of the danger of generalising from
isolated cases.
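The working of $\chi^2$, $\phi^2$ and the coefficient of contingency for the grouped table of § 9 may be sketched as below. The actual frequencies are those of the grouped table; the no-correlation figures are here recomputed as (row total × column total)/N rather than taken from the printed values, so the result may differ from 198.8 in the first decimal place.

```python
import math

# Actual frequencies of the grouped table (rows: unexpired term groups,
# columns: maturity ages 50 and under, 55, 60, 65 and over).
observed = [
    [247, 141, 181, 91],
    [178, 155, 237, 95],
    [137, 167, 271, 99],
    [99, 123, 231, 85],
    [12, 57, 178, 86],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
N = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, actual in enumerate(row):
        expected = row_totals[i] * col_totals[j] / N   # no-correlation frequency
        chi2 += (actual - expected) ** 2 / expected

phi2 = chi2 / N
contingency = math.sqrt(phi2 / (1 + phi2))
print(round(chi2, 1), round(phi2, 4), round(contingency, 3))
```

The first and last cells of the first column supply more than half of the whole $\chi^2$, as the contributions quoted in the footnote indicate.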

11. In § 7 when we worked out the connection between r and
$\phi^2$ we assumed that the frequencies took the form of the normal
correlation surface. This means that we assumed that the
totals of the columns and rows are "normal curves of error".
Let us suppose that we have no finer grouping than that given
in the table in § 9, then the totals of the columns are 673, 643,
1,098 and 456, making a total of 2,870, or reducing them to
a total frequency of unity, we have .2345, .2240, .3826 and
.1589. From tables of the "normal curve"* we can work out
the ordinates at the end of each group of frequency and form
the following table.


            (1) for      Total area     Ordinate at    Difference
Group       unit         from           beginning      of z's         Col. (5)
frequency   frequency    beginning      of group       negatively     squared      (6)/(2)
            n            (by adding     z
                         (2))
  (1)         (2)          (3)            (4)            (5)            (6)          (7)

From the columns

   673       .2345        .2345          .00000        -.30694        .09421        .402
   643       .2240        .4585          .30694        -.08984        .00807        .036
 1,098       .3826        .8411          .39678        +.15456        .02389        .062
   456       .1589       1.0000          .24222        +.24222        .05867        .369
                                         .00000
 2,870      1.0000                                                                  .869

                                                            Square root = .932

From the rows

   660       .2300        .2300          .00000        -.30365        .09220        .401
   665       .2317        .4617          .30365        -.09345        .00873        .038
   674       .2348        .6965          .39710        +.04759        .00226        .010
   538       .1875        .8840          .34951        +.15421        .02378        .127
   333       .1160       1.0000          .19530        +.19530        .03814        .329
                                         .00000
 2,870      1.0000                                                                  .905

                                                            Square root = .951

* Tables for Statisticians, Part II, Table ii.




The final figures are the square roots of

$$\frac{(z_1 - z_2)^2}{n_1} + \frac{(z_2 - z_3)^2}{n_2} + \frac{(z_3 - z_4)^2}{n_3} + \text{etc.}$$

where $z_1$, $z_2$, etc. are the ordinates at the beginning of
successive groups and $n_1$, $n_2$, etc. are the proportionate
frequencies, i.e. the successive terms in the preceding table,
col. (2). The corrected value is

$$\frac{.254}{.932 \times .951} = .286$$
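The adjusting factors can be obtained without printed tables of the normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library in place of Tables for Statisticians; the function name `grouping_factor` is ours, and the ordinates are found from the cumulative proportions exactly as in columns (3) and (4) of the table.

```python
from math import sqrt
from statistics import NormalDist

def grouping_factor(totals):
    """Square root of the sum of (z_s - z_{s+1})^2 / n_s over the groups."""
    std_normal = NormalDist()
    N = sum(totals)
    proportions = [t / N for t in totals]

    # Ordinates at the group boundaries: zero at both ends of the curve.
    ordinates = [0.0]
    cumulative = 0.0
    for p in proportions[:-1]:
        cumulative += p
        ordinates.append(std_normal.pdf(std_normal.inv_cdf(cumulative)))
    ordinates.append(0.0)

    return sqrt(sum((ordinates[s] - ordinates[s + 1]) ** 2 / proportions[s]
                    for s in range(len(proportions))))

factor_columns = grouping_factor([673, 643, 1098, 456])
factor_rows = grouping_factor([660, 665, 674, 538, 333])
corrected = 0.254 / (factor_columns * factor_rows)

print(round(factor_columns, 3), round(factor_rows, 3), round(corrected, 3))
```

The two factors agree with the square roots entered in the table, and the corrected coefficient with the value given above to within a unit in the third decimal place.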


12. The first three columns of the table in the preceding
paragraph are easily constructed; the third is wanted because
tables of the "areas of the normal curve" give those areas from
any point up to the end of the curve, i.e. the integral from x to
$\infty$. The next column gives the ordinate which can be found
from the tables in Tables for Statisticians, Part I, where the
ordinates and areas are in parallel columns, or directly from
the tables in Part II.

We will now turn to the theoretical side and may consider
what is the mean of each of the areas $n_1$, $n_2$, etc., say, of $n_s$.
It will be

$$\int_{x_s}^{x_{s+1}} x\,e^{-\frac{1}{2}x^2}\,dx \Big/ \int_{x_s}^{x_{s+1}} e^{-\frac{1}{2}x^2}\,dx = \frac{1}{\sqrt{2\pi}}\left(e^{-\frac{1}{2}x_s^2} - e^{-\frac{1}{2}x_{s+1}^2}\right) \Big/ n_s$$

$$= (z_s - z_{s+1})/n_s$$

Hence this expression gives us the distance of the mean value
of the area from the mean of the whole distribution. But
$n_s$ is the frequency and therefore

$$\left(\frac{z_s - z_{s+1}}{n_s}\right)^2 n_s = \frac{(z_s - z_{s+1})^2}{n_s}$$

when summed for all values of s gives the second moment of the
distribution and the adjustment, being the square root of a
second moment, is a standard deviation.

We had assumed the standard deviations to be unity: we
have now recalculated them on the facts available and adjusted
the result. This adjustment in effect removes to a large
extent the objections indicated in § 10.

13. It is a little difficult to judge the necessity or success of
an adjustment in a case of this kind unless we know the value
of the correlation which we ought to reach, and it will probably
be more convincing to take one of the coin-tossing tables and,
having grouped it, see what values of r are found by the
contingency method without adjustment and how near we get
to the true value of r with adjustments. For this purpose the
table where five coins were common to the pairs of tossings was
used and a table was formed as follows:


No. of heads                No. of heads in first tossing
in second
tossing              0-4               5-6               7-10           Total

0-3             2,123 (3,906)     2,541 (1,606)       968 (120)         5,632
4               2,534 (3,360)     3,031 (2,880)     1,155 (480)         6,720
5               3,038 (2,906)     3,640 (4,032)     1,386 (1,126)       8,064
6               2,534 (1,580)     3,031 (3,560)     1,155 (1,580)       6,720
7-10            2,123 (600)       2,541 (2,706)       968 (2,326)       5,632

Total              12,352            14,784             5,632          32,768

The actual figures are in brackets; the zero-contingency figures
stand before them.

$\phi^2$ = 6971/32768 = .2127
r (unadjusted) = .418

Then working the adjustment as before, .953 was found for the
rows and .892 for the columns, so that the adjusted value of
r is .418/(.953 × .892) = .492.
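The whole procedure for the first coin-tossing table, from the actual frequencies to the adjusted coefficient, can be put together in one sketch. Two assumptions are made: the actual frequencies are those shown in brackets, with the first entry of the row for 6 heads taken as 1,580 so that the row and column totals agree, and the normal ordinates come from `statistics.NormalDist` instead of printed tables. The function names are ours.

```python
from math import sqrt
from statistics import NormalDist

# Actual frequencies (rows: second tossing 0-3, 4, 5, 6, 7-10;
# columns: first tossing 0-4, 5-6, 7-10).
actual = [
    [3906, 1606, 120],
    [3360, 2880, 480],
    [2906, 4032, 1126],
    [1580, 3560, 1580],
    [600, 2706, 2326],
]

def phi_squared(table):
    """Return chi-square / N together with the row and column totals."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    chi2 = 0.0
    for i in range(len(rows)):
        for j in range(len(cols)):
            expected = rows[i] * cols[j] / n      # zero-contingency figure
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2 / n, rows, cols

def grouping_factor(totals):
    """Square root of the sum of (z_s - z_{s+1})^2 / n_s over the groups."""
    std_normal = NormalDist()
    n = sum(totals)
    proportions = [t / n for t in totals]
    ordinates = [0.0]
    cumulative = 0.0
    for p in proportions[:-1]:
        cumulative += p
        ordinates.append(std_normal.pdf(std_normal.inv_cdf(cumulative)))
    ordinates.append(0.0)
    return sqrt(sum((ordinates[s] - ordinates[s + 1]) ** 2 / proportions[s]
                    for s in range(len(proportions))))

phi2, row_totals, col_totals = phi_squared(actual)
r_unadjusted = sqrt(phi2 / (1 + phi2))
r_adjusted = r_unadjusted / (grouping_factor(row_totals) * grouping_factor(col_totals))

print(round(phi2, 4), round(r_unadjusted, 3), round(r_adjusted, 3))
```

With five coins out of ten common to the two tossings the true correlation is .5, so the adjustment removes most of the deficiency of the unadjusted value.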

Another trial may be made with the same coin-tossing table
throwing it into the form

                 0-4         5         6-10       Total

0-4             7,266      2,906      2,180      12,352
5               2,906      2,252      2,906       8,064
6-10            2,180      2,906      7,266      12,352

Total          12,352      8,064     12,352      32,768

Here $\phi^2$ = .171,

r (unadjusted) = .382;

the factor for rows is .872 and for columns is the same, so that
the adjusted value for r is .503.

Now both those should be .5, but clearly .492 and .503 are
good approximations with broad groupings, and the examples
show both the importance of the adjustment and the accuracy
attainable.

14. There is, however, one more aspect of this kind of
adjustment to which reference may be made. We remarked
(§ 8) that the method of contingency implied that we could
change the order of the columns and rows, but if we do this,
what will happen to the adjustments? The point is of some
importance. In broad groups where the division is not
quantitative, we may not be sure that if we could express the scale
quantitatively it would give a distribution of anything like the
assumed normal curve. Let us put this to the test by taking
the grouped figures from one of the tables in the preceding
paragraph and rearranging them arbitrarily.

Thus we might produce

Second                   First characteristic
characteristic       $a_0$        $a_1$        $a_2$       Total

$b_0$                4,032       1,126       2,906       8,064
$b_1$                3,560       1,580       1,580       6,720
$b_2$                2,880         480       3,360       6,720
$b_3$                2,706       2,326         600       5,632
$b_4$                1,606         120       3,906       5,632

Total               14,784       5,632      12,352      32,768

Clearly $\phi^2$ and the unadjusted r remain unchanged and so we
are only concerned with the totals and the procedure of § 11.
Working with these we reach .944 and .856 as the factors by
which we adjust and so find a value of .517 for r.* This is better
than the unadjusted value; in fact, quite good. The explanation
is that however we divide up the numbers we get adjustments
which will not vary to an extreme extent unless the
grouping is exceptional.

* The underlying theory of the adjustment is that a normal frequency surface
could be cut up to give the table. This could not, I think, be done in the particular
rearrangement. But the adjustments work well.

15. We have already seen that double entry tables will show
small values for a measure of correlation even when there is
really no correlation and that it is generally more important to
decide whether the apparent correlation is significant than to
measure exactly the standard error of its coefficient. All we
need to do in considering a standard error for $\phi^2$ is therefore
to compare the actual table with a table formed assuming no
correlation and see if the divergence is significant. In practical
work it is advisable to make this test before working out the
coefficient. Taking, for instance, the table on p. 218, $\chi^2 = 198.8$
and we need to find a value for the probability of a divergence
as great as or greater than that indicated on the assumption
that there is no correlation and that the particular table has
arisen merely in sampling.

There are 20 cells in the table but as we fix the totals of each
row and each column this would be too large a number to use
for $n'$. The correct number of free cells is $(h-1)(k-1)$ where
$h$ is the number of rows and $k$ the number of columns. In the
particular case $(h-1)(k-1) = 12$. In Tables for Statisticians,
Part I, Table xii, $n'$ is used as one more than the number of
free cells, i.e. $n' = n + 1$, and we must therefore enter that table
with $n' = 13$ and $\chi^2 = 198.8$. Knowing the value of $r$ from our
previous calculations, it is not surprising to find that the chance
of such a divergence from zero correlation is zero to at least
six decimal places.

In a fourfold table there is only one free cell.
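Where the printed table is not to hand, the probability can be computed directly. The sketch below is an illustration only: it uses the closed form for the upper tail of $\chi^2$ that holds when the number of free cells is even, and it takes the 20-cell table to be 4 by 5 (the only shape giving 12 free cells). The function name is hypothetical.

```python
from math import exp

def chi2_upper_tail_even_df(chi2, df):
    """P(chi-square with `df` degrees of freedom >= chi2), for even df only.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k        # builds (x/2)^k / k! step by step
        total += term
    return exp(-half) * total

rows, cols = 4, 5                      # assumed shape of the 20-cell table
free_cells = (rows - 1) * (cols - 1)   # 12
n_prime = free_cells + 1               # 13, the argument used in the printed table
p = chi2_upper_tail_even_df(198.8, free_cells)
```

The resulting `p` is far below one in a million, in agreement with the statement that the chance is zero to at least six decimal places.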

Another way of setting out the method described in this
section is to say that

(i) the mean $\phi^2 = (h-1)(k-1)/N$ ....(9)

(ii) $\sigma_{\phi^2} = \sqrt{2(h-1)(k-1)}/N$ ....(10)

when there is no correlation.

Formulae (9) and (10) set out the result in a form similar
to that already given for $\eta^2$ in formulae (4) and (5).

16. In working at contingency we have up to the present
assumed that we calculate the squares of the differences between
the actual figures in the table and the corresponding figures
when there is no correlation, but we may proceed by adding
together the differences regardless of sign. We then obtain the
mean of these by dividing by the total number of cases.
A diagram in Tables for Statisticians, Part I, gives values of
$r$. The mathematical work leading to this method is more
difficult than that for the mean square contingency given
above and in practical work the latter is more dependable.

17. There is yet another method of estimating correlation
that may be of help. It is known as correlation of ranks and
was suggested by Spearman.* By this method we estimate the
correlation between, say, the height and span of a number of
schoolchildren without making an exact measurement for any
child. We first stand the children in order of height and number
them in the rank, the shortest being numbered 1 and the
tallest $n$. Then we rearrange the children in order of span and
again number them, the child with the shortest span being
numbered 1 and the child with the longest span $n$. The child
numbered 1 in the height rank might be 3 in the span rank and
so on.

The next step is to calculate the sum of the squares of the
differences between the ranks, say $S(d^2)$, for the two characters
(one element in our height and span example would be
$(3-1)^2$ or 4) and we can write

$R = 1 - \dfrac{6\,S(d^2)}{n(n^2-1)}$ ....(11)

$r = 2\sin\left(\dfrac{\pi}{6}R\right)$ ....(12)

where $R$ is the coefficient of correlation between ranks and $r$
the corresponding coefficient between variates, similar to
that discussed in previous chapters. The relationship depends
on the assumption of the normal correlation surface.

The standard error of $r$ found by this method is approximately
5 per cent greater than that found by the product-moment
method of Chapter X.

* C. Spearman, Amer. J. Psychol. xv, 72; K. Pearson, Drapers' Company
Memoir, No. 4.
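A minimal sketch of formulae (11) and (12) follows. The five pairs of ranks are invented for illustration (the first child is 1 in height and 3 in span, as in the text), and the function names are hypothetical.

```python
from math import pi, sin

def rank_correlation(ranks_a, ranks_b):
    """Formula (11): R = 1 - 6 S(d^2) / (n (n^2 - 1))."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rank lists of equal length, n >= 2")
    s_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * s_d2 / (n * (n * n - 1))

def variate_correlation(rank_coefficient):
    """Formula (12): r = 2 sin(pi R / 6), assuming a normal correlation surface."""
    return 2.0 * sin(pi * rank_coefficient / 6.0)

# Invented ranks for five children.
height_rank = [1, 2, 3, 4, 5]
span_rank = [3, 1, 2, 5, 4]

R = rank_correlation(height_rank, span_rank)
r = variate_correlation(R)
```

Here $S(d^2) = 8$, so that $R = .6$ and $r$ is a little larger than $R$, as formula (12) requires for values between 0 and 1.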





CHAPTER XIII 


PARTIAL CORRELATION

1. We have up to the present assumed that we can always
deal with pairs of related things, but in many investigations,
especially perhaps in social statistics, the problems are complicated
by a greater number of variables. Suppose, for instance,
we were making a study of infant deaths and trying to ascertain
the causes chiefly responsible for a high death-rate; we might
examine the home environment of children in a particular
district to see whether there was any relation between infant
deaths and the habits of the mother. But the health of the
mother may be important also, and if we find correlation
coefficients in respect of (1) infant deaths and habits of mother
and (2) infant deaths and health of mother, we have up to the
present found no way of eliminating the possible relation
between health and habits of the mother. In other words, if the
cause of infant mortality is connected with the habits of the
mother, is it merely so connected because health and habits
are connected?

2. Let us proceed as we did in dealing with correlation where
there are only two variables and assume that $x_1, y_1, z_1$; $x_2, y_2, z_2$;
etc. be associated deviations, and let

$z = a + bx + cy$ ....(1)

As before, we can omit $a$ if we measure every variable from its
mean. Then using the method of moments we have

$(bx_1 + cy_1) + (bx_2 + cy_2) + \dots = z_1 + z_2 + \dots$

$(bx_1 + cy_1)x_1 + (bx_2 + cy_2)x_2 + \dots = x_1 z_1 + x_2 z_2 + \dots$

or $bS(x^2) + cS(xy) = S(xz)$ ....(2)

Similarly $bS(xy) + cS(y^2) = S(yz)$ ....(3)




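Now, slightly altering the notation used on p. 145, we
write

$S(xy) = N\sigma_x\sigma_y r_{xy}$
$S(x^2) = N\sigma_x^2$
$S(y^2) = N\sigma_y^2$

hence $S(xz) = N\sigma_x\sigma_z r_{xz}$

and $S(yz) = N\sigma_y\sigma_z r_{yz}$

Substituting in (2) and (3), we have

$b\sigma_x^2 + c\sigma_x\sigma_y r_{xy} = \sigma_x\sigma_z r_{xz}$

or $b\sigma_x + c\sigma_y r_{xy} = \sigma_z r_{xz}$

and $b\sigma_x r_{xy} + c\sigma_y = \sigma_z r_{yz}$

hence $b = \dfrac{\sigma_z}{\sigma_x}\cdot\dfrac{r_{xz} - r_{yz}r_{xy}}{1 - r_{xy}^2}$

and $c = \dfrac{\sigma_z}{\sigma_y}\cdot\dfrac{r_{yz} - r_{xz}r_{xy}}{1 - r_{xy}^2}$ ....(4)

Substituting in (1) and remembering that $a = 0$, we have

$z = \dfrac{r_{xz} - r_{yz}r_{xy}}{1 - r_{xy}^2}\cdot\dfrac{\sigma_z}{\sigma_x}\,x + \dfrac{r_{yz} - r_{xz}r_{xy}}{1 - r_{xy}^2}\cdot\dfrac{\sigma_z}{\sigma_y}\,y$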
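The solution (4) can be checked by substituting it back into the pair of equations from which it came. The sketch below does so for the three coefficients quoted in § 4 of this chapter; the standard deviations are taken as unity purely for convenience (an assumption, not data from the text), and the function name is illustrative.

```python
def plane_coefficients(r_xy, r_xz, r_yz, s_x=1.0, s_y=1.0, s_z=1.0):
    """Coefficients b and c of z = bx + cy from formula (4)."""
    denom = 1.0 - r_xy ** 2
    b = (s_z / s_x) * (r_xz - r_yz * r_xy) / denom
    c = (s_z / s_y) * (r_yz - r_xz * r_xy) / denom
    return b, c

# Coefficients from the numerical example of section 4; unit standard deviations assumed.
r_xy, r_xz, r_yz = 0.567, 0.213, 0.329
b, c = plane_coefficients(r_xy, r_xz, r_yz)

# With unit standard deviations the two equations preceding (4) read
#   b + c * r_xy = r_xz   and   b * r_xy + c = r_yz.
residual_1 = b + c * r_xy - r_xz
residual_2 = b * r_xy + c - r_yz
```

Both residuals vanish apart from rounding, and the coefficient of $x$ is seen to be small compared with that of $y$.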

Now when dealing with the two variables we expressed the
result in the forms

$y = r\dfrac{\sigma_y}{\sigma_x}x$

$x = r\dfrac{\sigma_x}{\sigma_y}y$

so that $r$, the measure of the correlation, is the geometric mean
of the coefficients $r\dfrac{\sigma_y}{\sigma_x}$ and $r\dfrac{\sigma_x}{\sigma_y}$.

Similarly with three variables we can write down $x$ in terms of
$z$ and $y$ or $y$ in terms of $x$ and $z$, and, again, using the geometric
mean of the appropriate pairs of coefficients, we have

$r_{xz \cdot y} = \dfrac{r_{xz} - r_{xy}r_{yz}}{\sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}}$

as the net (or partial) coefficient between $x$ and $z$ associated with
a single type of $y$. The square root in the denominator is to be
taken as positive.

3. Now coefficients of correlation must not exceed unity,
therefore

$(r_{xz} - r_{xy}r_{yz})^2 \le (1 - r_{xy}^2)(1 - r_{yz}^2)$

or, $r_{xz}$ must lie between the limits

$r_{xy}r_{yz} \pm \sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}$

From this we can write down some of the limits that may arise
when we are dealing with three variables.

If $r_{xy}$ =    and $r_{yz}$ =    Then $r_{xz}$ =

    0             0            any value
    1             1            1
   -1            -1            1
    1            -1            -1
    0            ±1            0
    0            ±r            between $\pm\sqrt{1 - r^2}$
    r             r            between 1 and $2r^2 - 1$
   -r            -r            between 1 and $2r^2 - 1$
    r            -r            between $1 - 2r^2$ and -1
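The entries of this table follow at once from the limits $r_{xy}r_{yz} \pm \sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}$, as the following sketch illustrates for one arbitrary value, $r = .6$. The function name is hypothetical.

```python
from math import sqrt

def limits_for_r_xz(r_xy, r_yz):
    """Lower and upper limits of r_xz for given r_xy and r_yz."""
    centre = r_xy * r_yz
    half_width = sqrt((1.0 - r_xy ** 2) * (1.0 - r_yz ** 2))
    return centre - half_width, centre + half_width

r = 0.6                                  # arbitrary illustrative value
unrestricted = limits_for_r_xz(0.0, 0.0)  # any value between -1 and 1
same_sign = limits_for_r_xz(r, r)         # between 2r^2 - 1 and 1
opposite_sign = limits_for_r_xz(r, -r)    # between -1 and 1 - 2r^2
one_zero = limits_for_r_xz(0.0, r)        # between -sqrt(1 - r^2) and +sqrt(1 - r^2)
```

With $r = .6$ the limits are $-.28$ and 1 when the two coefficients have the same sign, and $-1$ and $.28$ when the signs differ.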


4. We may now consider the following numerical example.*

* "Relative value of factors influencing infant welfare", Annals of Eugenics, i,
178-9. The statistics quoted are from Bradford, 1911. The student will find
many similar sets of tables in this paper.



Habits of             Health of Mother (y)
Mother (x)            Good      Not good      Total

Good                   956         197        1,153
Indifferent            257         286          543

Total                1,213         483        1,696

$r_{xy} = .567 \pm .033$


Child dead            Habits of Mother (x)
or not (z)            Good      Indifferent   Total

Living                 997         420        1,417
Dead                   156         123          279

Total                1,153         543        1,696

$r_{xz} = .213 \pm .046$


Child dead            Health of Mother (y)
or not (z)            Good      Not good      Total

Living               1,065         352        1,417
Dead                   148         131          279

Total                1,213         483        1,696

$r_{yz} = .329 \pm .045$


Let us now work out our partial coefficient between "Habits
of Mother" and "Infantile Deaths" for constant "Health of
Mother", and we have

$r_{xz \cdot y} = \dfrac{.213 - (.329)(.567)}{\sqrt{.891}\,\sqrt{.678}} = .034$

In other words the value, though it looked like .213 at first, now
proves to be only .034, and as the standard error is about .04
we could not say that the result is significant.






If we worked out the partial coefficient between "Health of
Mother" and "Infant Deaths", keeping "Habits" constant,
we reach a value of .26, which is significant though smaller
than the crude figure of .329.
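Both partial coefficients of this section can be reproduced from the three crude coefficients. The sketch below applies the formula of § 2; the function and variable names are illustrative only.

```python
from math import sqrt

def partial_r(r_ab, r_ac, r_bc):
    """Correlation between a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / sqrt((1.0 - r_ac ** 2) * (1.0 - r_bc ** 2))

# Crude coefficients from the three tables: x = habits, y = health, z = infant deaths.
r_xy, r_xz, r_yz = 0.567, 0.213, 0.329

# Habits and deaths, health constant.
habits_deaths = partial_r(r_xz, r_xy, r_yz)
# Health and deaths, habits constant.
health_deaths = partial_r(r_yz, r_xy, r_xz)
```

To the number of places quoted the results are .034 and .26.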

5. It is possible to extend the theory to a larger number of
variables, but it seems unnecessary to do so here. The example
will give an indication of the use to which such work may be
put, and supplies a warning against accepting the numerical
value of a coefficient until other causes that may affect the
result have been considered.





APPENDIX I 


CORRECTIONS FOR MOMENTS

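1. The following method has been suggested by E. Pairman
and K. Pearson (Biometrika, xii, 231 et seq.) when the curve
rises abruptly at one or both ends.

Let $n_1, n_2$, etc. be the proportionate frequencies in the 1st,
2nd, etc. groups, then put

$a_1 = -\tfrac{1}{60}\{137n_1 - 163n_2 + 137n_3 - 63n_4 + 12n_5\}$
$a_2 = \tfrac{1}{12}\{45n_1 - 109n_2 + 105n_3 - 51n_4 + 10n_5\}$
$a_3 = -\tfrac{1}{4}\{17n_1 - 54n_2 + 64n_3 - 34n_4 + 7n_5\}$
$a_4 = \{3n_1 - 11n_2 + 15n_3 - 9n_4 + 2n_5\}$
$a_5 = -\{n_1 - 4n_2 + 6n_3 - 4n_4 + n_5\}$

Similarly, values of $b_1, b_2$, etc. can be obtained from the
other end of the distribution.

Then the values of the moments are as follows, where $A$ is
the distance of the start and $B$ is the distance of the end of
the distribution from the origin about which moments are
calculated:

$\mu'_1 = \nu'_1 + \{\tfrac{1}{12}(a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5)\}$

$\mu'_2 = \nu'_2 - \tfrac{1}{12} + \{-\tfrac{1}{120}(a_2 - \tfrac{5}{126}a_4) + \tfrac{1}{6}A(a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5)\}$

$\mu'_3 = \nu'_3 - \tfrac{1}{4}\nu'_1 + \{-\tfrac{1}{40}(a_1 - \tfrac{5}{63}a_3 + \tfrac{1}{240}a_5) - \tfrac{1}{40}A(a_2 - \tfrac{5}{126}a_4) + \tfrac{1}{4}A^2(a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5)\}$

$\mu'_4 = \nu'_4 - \tfrac{1}{2}\nu'_2 + \tfrac{7}{240} + \{\tfrac{1}{126}(a_2 - \tfrac{7}{80}a_4) - \tfrac{1}{10}A(a_1 - \tfrac{5}{63}a_3 + \tfrac{1}{240}a_5) - \tfrac{1}{20}A^2(a_2 - \tfrac{5}{126}a_4) + \tfrac{1}{3}A^3(a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5)\}$

and similar expressions in $B$ and $b$'s.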




If the moments be taken about the start of the first group so
that the first group is multiplied by powers of ½, the second by
powers of 1½ and so on, this expression is simplified so far as the
$a$ terms are concerned because the terms involving $A$ vanish.

2. The method of reaching these adjustments starts with the
Euler-Maclaurin expansion and assumes that the curve takes
the form
l + + etc 

at the beginning and a similar form at the end. This leads to the
values of the $a$'s.

The differential coefficients at each end required in the
Euler-Maclaurin expansion are then evolved and the result given is
reached.

The frequency at the start is approximately

$\dfrac{N}{60}\{137n_1 - 163n_2 + 137n_3 - 63n_4 + 12n_5\}$

By means of this expression we can discover how nearly the
frequency curve comes to zero at the ends of the range.

3. A few numerical examples may be given. The rule that the
area, in the case of high contact, can be found by adding
ordinates, when tested by adding 12 ordinates of the normal
curve calculated to 5 decimal places, gave 1.24998 instead of
1.25000. Nine ordinates of a Type III curve with high contact
gave 24473 instead of 24475.

An example of the method of § 1 above is taken from the
paper there cited. Moments for $\sqrt{x} \times 100{,}000$ from $x = 0$ to
$x = 10$ were calculated, the exact result being known. The
proportional frequencies, which may be taken as the data, are

$n_1 = .031623$     $n_6 = .111205$
$n_2 = .057820$     $n_7 = .120904$
$n_3 = .074874$     $n_8 = .129880$
$n_4 = .088665$     $n_9 = .138273$
$n_5 = .100571$     $n_{10} = .146185$

Total 1.000000





From these figures

$a_1 = -.0131,0643$     $b_1 = .1499,9857$
$a_2 = -.0444,8167$     $b_2 = -.0074,9283$
$a_3 = .0258,4150$      $b_3 = -.0003,9450$
$a_4 = -.0148,8400$     $b_4 = -.0000,2600$
$a_5 = .0045,0200$      $b_5 = -.0000,3800$

$a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5 = -.0135,3533$
$a_2 - \tfrac{5}{126}a_4 = -.0438,9104$ and so on.

Putting $A = 0$ and $B = 10$ and calculating moments about
the start, we require for the $a$ adjustments

$\tfrac{1}{12}(a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5) = -.0011,2794$

and the other adjustments in order are

$-.0003,6576$, $-.0003,7846$ and $-.0003,4269$.

For the $b$ terms we have

$\tfrac{1}{12}(b_1 - \tfrac{1}{60}b_3 + \tfrac{1}{2520}b_5) = .0125,0043$
$\tfrac{1}{6}B(b_1 - \tfrac{1}{60}b_3 + \tfrac{1}{2520}b_5) = .2500,0860$

and the other terms in order give

3.7501,2825,     50.0017,1000
-.0000,6244,     -.0001,8731
-.0374,6365,     .0037,5074
.1500,2972,      -.0000,6945

Finally for the adjusted moments we reach

             Raw          With Sheppard's    With full      True
             moments      adjustment         adjustment     value

$\mu'_1$        5.9880         5.9880            5.9994        6.0000
$\mu'_2$       42.6900        42.6067           42.8570       42.8571
$\mu'_3$      331.0854       329.5884          333.3349      333.3333
$\mu'_4$     2698.7735      2677.4576         2727.2757     2727.2727
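The coefficients of this example can be recomputed from the ten proportional frequencies. The sketch below uses the expressions for $a_1$ to $a_5$ of § 1; for the $b$'s it assumes that the same expressions are applied to the series read from the other end with the sign of each result changed, an inference which reproduces the printed values of $b_1$ to $b_5$ but which is not stated in this form in the text. The function names are illustrative.

```python
def terminal_coefficients(n):
    """a_1 ... a_5 from the first five proportionate frequencies of the series n."""
    n1, n2, n3, n4, n5 = n[:5]
    a1 = -(137 * n1 - 163 * n2 + 137 * n3 - 63 * n4 + 12 * n5) / 60
    a2 = (45 * n1 - 109 * n2 + 105 * n3 - 51 * n4 + 10 * n5) / 12
    a3 = -(17 * n1 - 54 * n2 + 64 * n3 - 34 * n4 + 7 * n5) / 4
    a4 = 3 * n1 - 11 * n2 + 15 * n3 - 9 * n4 + 2 * n5
    a5 = -(n1 - 4 * n2 + 6 * n3 - 4 * n4 + n5)
    return [a1, a2, a3, a4, a5]

# Proportional frequencies of the worked example (areas under sqrt(x), x = 0 to 10).
freq = [0.031623, 0.057820, 0.074874, 0.088665, 0.100571,
        0.111205, 0.120904, 0.129880, 0.138273, 0.146185]

a = terminal_coefficients(freq)
# Assumed convention for the far end: reverse the series and change the sign.
b = [-value for value in terminal_coefficients(freq[::-1])]

def mean_term(c):
    """The term (1/12)(c_1 - c_3/60 + c_5/2520) entering the first moment."""
    return (c[0] - c[2] / 60 + c[4] / 2520) / 12

mean_term_a = mean_term(a)
mean_term_b = mean_term(b)
```

The two terms come to $-.0011,2794$ and $.0125,0043$, so that the adjustment to the first moment is about .0114, the difference between 5.9880 and 5.9994 in the table.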


4. The method described above gives good results but is
laborious. The approximations are less satisfactory in those
cases where the first group does not relate to a complete unit
base and the curve rises abruptly. The same authors gave
a method for J-shaped distributions, but I should not use it
as a simple approximation can be found by examining the
exponential (Type X).

When statistics expressible by the exponential $y = y_0 e^{-x/\sigma}$
are stated in groups for each equal subrange $h$ of $x$, the
successive groups are $\int_0^h y_0 e^{-x/\sigma}\,dx$, $\int_h^{2h} y_0 e^{-x/\sigma}\,dx$, etc., or

$y_0\sigma(1 - e^{-h/\sigma})$, $y_0\sigma(1 - e^{-h/\sigma})e^{-h/\sigma}$, $y_0\sigma(1 - e^{-h/\sigma})e^{-2h/\sigma}$, etc.

These terms may also be regarded as a geometrical progression,*
the first term being $y_0\sigma(1 - e^{-h/\sigma})$ and the common ratio $e^{-h/\sigma}$.
It follows that if we treat the areas as a geometrical progression
extending to infinity, calculate the moments on this assumption
and read the result as graduated terms of a geometrical
progression, we shall reach correctly graduated areas, and we
can subsequently write down the equation to the curve with
little trouble.

Other points are however involved. Let us write the
geometrical progression as $ka^x$ and put $A = (1-a)^{-1}$; then the
moments about its mean are

2nd moment   $A^2 - A$
3rd    „     $2A^3 - 3A^2 + A$
4th    „     $9A^4 - 18A^3 + 10A^2 - A$

and if we work out $\beta_1$ and $\beta_2$ we get $4 + h^2/\mu_2$ and $9 + h^2/\mu_2$
respectively.

Using the exponential, the moments, etc. about the mean
are: $\mu_2 = \sigma^2$, $\mu_3 = 2\sigma^3$, $\mu_4 = 9\sigma^4$, $\beta_1 = 4$, $\beta_2 = 9$.
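The expressions in $A$ may be verified by direct summation of a geometrical progression. The sketch below takes unit spacing ($h = 1$) and an arbitrary common ratio $a = .6$; the names are illustrative and the series is cut off at a point where the remaining terms are negligible.

```python
def geometric_central_moments(a, terms=400):
    """2nd, 3rd and 4th moments about the mean of weights (1 - a) a^x, x = 0, 1, 2, ..."""
    weights = [(1.0 - a) * a ** x for x in range(terms)]
    mean = sum(x * w for x, w in enumerate(weights))
    return tuple(
        sum((x - mean) ** power * w for x, w in enumerate(weights))
        for power in (2, 3, 4)
    )

a = 0.6                          # arbitrary common ratio for illustration
A = 1.0 / (1.0 - a)
mu2, mu3, mu4 = geometric_central_moments(a)

# Closed forms quoted in the text.
mu2_formula = A ** 2 - A
mu3_formula = 2 * A ** 3 - 3 * A ** 2 + A
mu4_formula = 9 * A ** 4 - 18 * A ** 3 + 10 * A ** 2 - A

beta1 = mu3 ** 2 / mu2 ** 3      # should equal 4 + 1/mu2 when h = 1
beta2 = mu4 / mu2 ** 2           # should equal 9 + 1/mu2 when h = 1
```

For $a = .6$ the second moment is 3.75 and both $\beta_1$ and $\beta_2$ exceed the exponential values 4 and 9 by $1/3.75$.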

Hence when we calculate moments, assuming that the
statistics form a geometrical progression, whereas they are
really areas from a curve, and seek to choose the type of curve
from Pearson's criteria in his system, we shall reach a persistent
error. For this purpose the $\beta_1$ and $\beta_2$ found from the statistics
should be reduced by $h^2/\mu_2$.
This rule can be used as an approximation in all J-shaped
curves and will be found to give satisfactory results.

* "Geometrical progression" is used throughout to describe a discrete series
and exponential curve to describe a continuous one.

So far we have assumed that we know the start of the curve
and that all the bases of the areas are of equal size. If this does
not apply we can, in the case of an exponential curve, fit the
curve, excluding the first (incomplete) term, and regard that
term as related to an appropriate base extrapolated from the
graduation of the remainder. This is an arbitrary arrangement
but has practical advantages.
In other J-shaped curves in similar circumstances the first
step would be to assume an exponential, to find therefrom
approximately the base of the first incomplete group, and then
assume that the area is concentrated at the middle point. This
will generally give good results: the assumption of the exponential
overstates the base and the assumption of half-way
assumes a less rapidly falling curve than the J-shaped forms
of Types I and III. There is therefore a balance of error.

Turning to the statistical side, the example on p. 108 gives
$\mu_2 = 2.045$, $\beta_1 = 4.629$, $\beta_2 = 9.502$. These figures come from
the unadjusted moments, and deducting .49 from the above
values for $\beta_1$ and $\beta_2$ we reach 4.14 and 9.01. The theoretical
values when an exponential curve is to be used are 4 and 9.
If we apply the rule as an approximation in other J-shaped
cases we find that in the example on p. 112, where a twisted
J-shaped curve is given, $\mu_2 = 4.266$, $\beta_1 = .761$, $\beta_2 = 2.646$, and
the adjustment leads to $\beta_1 = .527$ and $\beta_2 = 2.412$. Hence
$5\beta_2 - 6\beta_1 - 9$ becomes $-.098$ instead of $-.368$. The theoretical
criterion would lead us to expect $5\beta_2 - 6\beta_1 - 9 = 0$.
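The rule of deducting $h^2/\mu_2$ is easily applied. The sketch below repeats the two adjustments just quoted, taking $h$ as the working unit; the function name is illustrative.

```python
def adjust_betas(beta1, beta2, mu2, h=1.0):
    """Reduce beta_1 and beta_2 by h^2 / mu_2, as suggested for J-shaped curves."""
    correction = h * h / mu2
    return beta1 - correction, beta2 - correction

# Example of p. 108 (exponential case): theoretical values are 4 and 9.
exp_beta1, exp_beta2 = adjust_betas(4.629, 9.502, 2.045)

# Example of p. 112 (twisted J-shaped curve).
twisted_beta1, twisted_beta2 = adjust_betas(0.761, 2.646, 4.266)
```

The first pair comes to 4.14 and 9.01, and the second to .527 and 2.412, as stated above.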

These examples are not, of course, complete evidence, but
they show that the suggestion may lead to accurate results,
and it has the merit of simplicity. The rule with regard to the
adjustment of the $\beta$'s by $h^2/\mu_2$ may be combined with the
approximations given on p. 109, where it is mentioned that the
mean is overstated, when $\sigma$ is positive, by about $h^2/(12\sigma)$ and
the second moment about the true mean (i.e. the mean as
corrected by $h^2/(12\sigma)$) is understated by about $h^2/12$. We do not
know $\sigma$ exactly but can use the square root of the second
moment as found from the calculations. If $h$ be taken as a
unit and the moments found in terms of $h$, i.e. in working
units, the corrections are $1/(12\sqrt{\mu_2})$ and $1/12$.

5. An alternative to the method of § 1 is to find mid-ordinates
corresponding to the areas of the groups and treat these
mid-ordinates in the manner explained in Chapter III, § 18.
The mid-ordinates $m_1, m_2$, etc. are found by the following
equations:

$m_3 = \tfrac{1}{1920}\{2134n_3 - 116(n_2 + n_4) + 9(n_1 + n_5)\}$
$m_2 = \tfrac{1}{1920}\{-71n_1 + 2044n_2 - 26n_3 - 36n_4 + 9n_5\}$
$m_1 = \tfrac{1}{1920}\{1689n_1 + 684n_2 - 746n_3 + 364n_4 - 71n_5\}$

The total frequency is not exactly reproduced but the moments
obtained are good approximations.
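The three sets of multipliers can be tested on a curve for which the mid-ordinates are known. The sketch below uses the areas of $y = x^4$ over unit groups starting at zero, a test case chosen here for illustration and not taken from the text; formulae of this kind should reproduce a curve of the fourth degree exactly. The names are hypothetical.

```python
# Multipliers of n_1 ... n_5 (each set to be divided by 1920).
MID_ORDINATE_WEIGHTS = {
    1: (1689, 684, -746, 364, -71),
    2: (-71, 2044, -26, -36, 9),
    3: (9, -116, 2134, -116, 9),
}

def mid_ordinate(group, areas):
    """Mid-ordinate of group 1, 2 or 3 from the areas of the first five groups."""
    weights = MID_ORDINATE_WEIGHTS[group]
    return sum(w * n for w, n in zip(weights, areas[:5])) / 1920

# Areas of y = x^4 over the groups 0-1, 1-2, ..., 4-5: (i^5 - (i-1)^5) / 5.
areas = [(i ** 5 - (i - 1) ** 5) / 5 for i in range(1, 6)]

m1, m2, m3 = (mid_ordinate(group, areas) for group in (1, 2, 3))
# True mid-ordinates are 0.5^4, 1.5^4 and 2.5^4.
```

For groups beyond the third the symmetrical set of multipliers would be applied to the five groups centred on the one required.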

6. It has been pointed out that one of the difficulties in
calculating moments when the curve rises abruptly at one or
both ends arises because the true start or end of the curve
is unknown. In other words, the base of the first area or last
area (or both) is smaller than that of the other areas. In
practice good results can often be obtained with unadjusted
moments but the first attempt may require modification by
varying the range of the curve (see p. 124). When this is
done, the moment contribution for the first area, or the last,
or both, must be recalculated by assuming that the area is
concentrated at the middle of the smaller base.

E. S. Martin has approached the problem more systematically
in a paper in Biometrika, xxiv, 12, and has given tables
from which the start of the curve may be estimated.





